U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-HRT-14-057
Date: February 2018

 

Safety Evaluation of Access Management Policies and Techniques

CHAPTER 6. ANALYSIS

GLM techniques were applied to estimate the models. A negative binomial error structure was specified, following the state of the art in modeling crash data. The negative binomial structure is now recognized as more appropriate for crash counts than the normal distribution assumed in conventional regression modeling. Crash counts per year by crash type were used as estimates of the dependent variable, while corresponding roadway characteristics and traffic volume data were used as the independent variables.
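To illustrate why the negative binomial structure suits crash counts, the following minimal sketch compares it with a Poisson distribution of the same mean. It assumes the common parameterization in which the variance equals mu + k * mu^2; the function names and the overdispersion symbol `k` are illustrative and do not come from the report.

```python
import math

def nb_log_pmf(y, mu, k):
    """Log-probability of y crashes under a negative binomial distribution
    with mean mu and overdispersion k (assumed form: var = mu + k * mu**2)."""
    r = 1.0 / k  # inverse of the overdispersion parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))

def poisson_log_pmf(y, mu):
    """Log-probability of y crashes under a Poisson distribution (var = mu)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def nb_variance(mu, k):
    """Variance implied by the assumed negative binomial parameterization."""
    return mu + k * mu ** 2

if __name__ == "__main__":
    mu, k = 3.0, 0.5  # hypothetical values chosen only for illustration
    print("NB variance:", nb_variance(mu, k), "vs. Poisson variance:", mu)
    for y in (0, 3, 12):
        print(y, math.exp(nb_log_pmf(y, mu, k)), math.exp(poisson_log_pmf(y, mu)))
```

With the same mean, the negative binomial assigns more probability to very high and very low counts than the Poisson, which is the overdispersion pattern typically seen in crash data.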

Preliminary models were developed for each of the four regions (Northern California, Southern California, Minnesota, and North Carolina). Within each land use type (i.e., mixed-use, commercial, and residential), each corridor was identified as located within an urban, suburban, or urbanizing area. All area types were combined within the respective land use type to develop reliable models. A factor variable was included in each model to account for any differences attributable to area type, but the differences were minor and not statistically significant. This is not to say there is no difference in crash patterns among area types, but the data did not allow quantification of this relationship. It is also likely that area type is better described by other variables in the model. For example, the traffic volume, number of lanes, access density, and frontage development can be used to describe the characteristics of a corridor and are more quantitative than defining a corridor as “urban, suburban, or urbanizing.” Therefore, area type was not included in the final models.

The first step in the analysis process was to develop a model using only AADT as a predictor variable and both the number of years and corridor length as offset variables. The general form of this model is given by the equation in figure 17.

$$\text{Crashes} = \text{years} \times \text{segment length} \times \alpha \times \text{AADT}^{\beta}$$

Figure 17. Equation. General form of crash prediction model.

 

Where:
α = constant term estimated from the regression model.
β = estimated coefficient from the regression model for AADT.
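The general form in figure 17 can be sketched in a few lines. The sketch also shows the equivalent log-linear form, in which years and corridor length act as offsets (terms whose coefficients are fixed at 1). The function names and the numeric values of `alpha` and `beta` are hypothetical placeholders, not estimates from this study.

```python
import math

def predicted_crashes(years, length, aadt, alpha, beta):
    """Crashes = years * segment length * alpha * AADT**beta (figure 17)."""
    return years * length * alpha * aadt ** beta

def log_predicted_crashes(years, length, aadt, alpha, beta):
    """Same model on the log scale: ln(years) and ln(length) are offsets,
    ln(alpha) is the intercept, and beta multiplies ln(AADT)."""
    return (math.log(years) + math.log(length)
            + math.log(alpha) + beta * math.log(aadt))

if __name__ == "__main__":
    # Hypothetical inputs and coefficients, for illustration only.
    args = dict(years=3, length=1.5, aadt=20000, alpha=0.0005, beta=0.8)
    direct = predicted_crashes(**args)
    via_log = math.exp(log_predicted_crashes(**args))
    print(direct, via_log)  # the two forms agree up to rounding
```

Because years and length are offsets, doubling either one doubles the predicted crash count, while the effect of AADT depends on the estimated β.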

The general model was successfully developed in each scenario, and there were no apparent outliers or errors in the data. Additional variables were then investigated. This investigation involved entering each variable one at a time such that only AADT and the new variable of interest were included. The estimated parameter and its standard error were examined to determine the following:

Alternate model forms were explored using the procedure described by Hauer and Bamfo.(18) To summarize, a model with AADT as the only explanatory variable was first estimated. Then, for a variable of interest, the model was used to predict the number of crashes for each site. The sum of observed crashes for all sites with the same value of the explanatory variable (or range of values in the case of continuous variables) was then divided by the sum of predicted crashes for those same sites. A plot of the observed-to-predicted ratios across the range of the explanatory variable was then used to examine trends that would suggest an appropriate model form for that variable. The exponential model form was judged appropriate because of its flexibility, and this form was retained for development of the final models.
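The ratio calculation in this procedure can be sketched as follows. This is a minimal illustration, not the research team's code: the site records, the field names (`observed`, `predicted`, `access_density`), and the bin width are all hypothetical.

```python
from collections import defaultdict

def observed_to_predicted_ratios(sites, bin_of):
    """Group sites by a value (or range) of the variable of interest and
    return sum(observed) / sum(predicted) for each group."""
    observed = defaultdict(float)
    predicted = defaultdict(float)
    for site in sites:
        group = bin_of(site)
        observed[group] += site["observed"]
        predicted[group] += site["predicted"]  # from the AADT-only model
    return {g: observed[g] / predicted[g]
            for g in sorted(observed) if predicted[g] > 0}

if __name__ == "__main__":
    # Made-up records for illustration; not data from the study.
    sites = [
        {"access_density": 10, "observed": 4, "predicted": 5.0},
        {"access_density": 12, "observed": 6, "predicted": 5.0},
        {"access_density": 35, "observed": 9, "predicted": 6.0},
        {"access_density": 38, "observed": 12, "predicted": 8.0},
    ]
    # Continuous variable: bin into ranges of width 20 (an assumed width).
    ratios = observed_to_predicted_ratios(
        sites, bin_of=lambda s: 20 * (s["access_density"] // 20))
    for lower_bound, ratio in ratios.items():
        print(lower_bound, ratio)
```

Plotting these ratios against the group values shows whether the AADT-only model under- or over-predicts systematically as the variable changes, which in turn suggests a functional form for that variable.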

Pearson correlation statistics were computed among the dependent and independent variables. The correlation matrix was not the primary driver of model building but helped to identify those variables most associated with the different crash types. It also helped to identify independent variables that were highly correlated with one another. High correlation between independent variables can be problematic in developing models. Specifically, the inclusion of highly correlated variables can lead to illogical results. While omitting a highly correlated variable may help avoid this issue, doing so limits the practicality of the results if one is interested in the safety impacts of the omitted variable. In this study, the research team estimated a series of models with various combinations of variables as a reasonable compromise between statistical efficiency and practicality. This addressed issues related to correlation and provided information for all variables of interest.
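A correlation screen of this kind can be sketched as below. The 0.7 cutoff, the variable names, and the sample values are assumptions made for illustration; the report does not state a threshold.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def highly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.
    The default threshold is an assumed cutoff, not one from the report."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

if __name__ == "__main__":
    # Made-up corridor values for illustration; not data from the study.
    columns = {
        "posted_speed": [30, 35, 40, 45, 50],
        "access_density": [60, 48, 41, 30, 22],
        "aadt": [18000, 26000, 15000, 30000, 21000],
    }
    for name_a, name_b, r in highly_correlated_pairs(columns):
        print(name_a, name_b, round(r, 3))
```

Pairs flagged this way are candidates for being entered in separate models rather than together, in line with the compromise described above.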

The next step was to enter the most promising variables into the model in combinations. Some variables were dropped because the effect was not statistically significant or because the direction of effect was illogical. The latter case was likely due to highly correlated variables in the model. In some cases, it was necessary to choose between two or more variables, removing highly correlated variables from the model. The main factors in these decisions were improvement in overall model fit and selection of the variables that were most likely of interest in the application of the model to AM.

Following the development of preliminary models for each region, feedback was requested from the steering committee on which variables with promise were most desired in the models. Not all variables could be included in the models owing to both sample size limitations and correlation between potential explanatory variables. Therefore, the steering committee was asked to identify the explanatory variables that would be most useful to practitioners. The following variables were indicated to be most important for practical use according to the feedback:

Of these variables, all were included in various models except for posted speed limit. It should be noted that vehicle speed is related to the severity of a crash, but the posted speed limit was not included in these models because it was not statistically significant after accounting for other variables. Posted speed tends to be highly correlated with other variables such as access density and frontage type. This is likely the reason it could not be included in the final models. It is also possible that posted speed does not provide an accurate representation of the actual speeds (i.e., operating speed may be a better alternative for capturing the impacts of speed).

Other variables were also explored for potential inclusion in the models. For example, the number of lanes is a common variable to describe the characteristics of a roadway. In this case, the number of lanes was allowed to vary throughout a corridor (i.e., a new corridor was not defined if the number of lanes changed). This helped to avoid issues related to frequent section breaks (e.g., low crash counts associated with short segments). To describe the variation in lanes within a corridor, the following variables were defined:

The following section discusses the final modeling results, and the final models are presented in appendix C.

 

 
