U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590


Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-14-081    Date:  November 2014


Enhancing Statistical Methodologies For Highway Safety Research – Impetus From FHWA


This section summarizes a small, mostly recent sample of road safety research that applied some of the advanced, innovative statistical techniques identified at the technical experts meeting. Some of this research culminated in papers presented at the TRB Annual Meeting held in January 2014 in Washington, DC. The abstracts are reproduced from the papers.

The intent of this appendix is to illustrate that the application of these advanced techniques is topical and within the capabilities of road safety researchers and, in so doing, to stimulate others to pursue these techniques in their research to develop crash modification factors (CMFs) and safety performance functions (SPFs).

Non-Parametric Regression

Elmitiny, N., Harb, R., Radwan, E. and Ahmed, M. “Traffic Operation Factors Related to Red-light Running: An Empirical Analysis.” Presented at the Transportation Research Board Annual Meeting, Washington, DC, January 2014.

Abstract: This paper investigates the relationship between the red light running phenomena and traffic parameters in the vicinity of intersections. Data was collected on two intersections in Central Florida for a period of 9 months using traffic monitoring cameras acquired from ITERIS. Collected data included traffic characteristics; signal timing data, as well as the frequency of red light running 7 days a week, 24 h a day. Using Augmented Multivariate Adaptive Regression Splines (MARS), a recursive non-parametric regression technique, it was determined that traffic volume, average speed, percentage time green, and percentage large vehicles in the traffic composition were strongly associated with red-light running. It was also observed that vehicular volume and percentage large vehicles have an interactive relationship with red-light running. Increase in percentage in traffic volume is associated with an increase in the red-light running.
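As a rough illustration of the building blocks behind MARS, the sketch below shows the hinge basis functions the technique combines and how a product of two hinges yields an interaction term, such as the volume and large-vehicle interaction the abstract reports. The knots, coefficients, and function names are invented for illustration and are not taken from the paper, which used an augmented variant of MARS fitted to field data.

```python
def hinge(x, knot, direction=1):
    """MARS hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def predicted_violations(volume, pct_large_vehicles):
    """Toy MARS-style predictor. Every knot and coefficient is hypothetical."""
    volume_term = hinge(volume, 800.0)            # active only above 800 veh/h
    trucks_term = hinge(pct_large_vehicles, 5.0)  # active only above 5 percent
    return (
        1.0                                       # intercept
        + 0.004 * volume_term                     # main effect of volume
        + 0.0005 * volume_term * trucks_term      # interaction: product of two hinges
    )
```

Below both knots the prediction stays at the intercept; the interaction term contributes only when volume and the share of large vehicles are both above their knots. A real MARS fit selects the knots and terms from data through forward selection and backward pruning.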

Thakali, L., Fu, L., and Chen, T. “Comparison Between Parametric and Nonparametric Approaches for Road Safety Analysis: Case Study of Winter Road Safety.” Presented at the Transportation Research Board Annual Meeting, Washington, DC, January 2014.

Abstract: In road safety research, a parametric approach is commonly applied in modeling road collisions, which has resulted in many different types of models such as Poisson, Negative Binomial and Poisson lognormal. While easy to apply and interpret, a parametric approach has several critical limitations due to the modeling requirement of assuming a specific probability distribution form for each model variable (e.g., collision frequency) and a pre-specified functional relationship between each model parameter and the predictors. These assumptions, if violated, could lead to biased and/or erroneous inferences on the effect of these predictors on the dependent variable. This paper introduces a data-driven, nonparametric alternative called Kernel regression, which circumvents the need for the aforementioned assumptions. This paper compares the parametric and nonparametric approaches through an empirical study using a large dataset consisting of hourly observations of collisions, road weather and surface conditions, and traffic counts from highways in Ontario, Canada, over six winter seasons. It is shown that the nonparametric approach has the advantage of being able to capture the significant nonlinear and interacting effects of some condition factors. The paper also illustrates the practical implications of the differences between the two approaches, including evaluation of the risk levels of road surface conditions for the road users and quantification of safety benefits of maintenance operations for transportation authorities.
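One common form of kernel regression is the Nadaraya-Watson estimator, sketched below for a single predictor with a Gaussian kernel. This is an assumption made for illustration: the abstract does not state which kernel estimator, kernel, or bandwidth the authors used, and the function names here are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_regression(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate: a kernel-weighted average of observed responses.

    No distribution or functional form is assumed; the bandwidth alone
    controls how local (and how smooth) the estimate is.
    """
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in x_train]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query point is too far from all observations")
    return sum(w * y for w, y in zip(weights, y_train)) / total
```

Because the estimate is a weighted average of observed responses, it always lies between the smallest and largest observed value, and it can follow nonlinear patterns that a pre-specified functional form would miss. The bandwidth is typically chosen by cross-validation.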

Bayesian Hierarchical Models

Chen, Y. and Persaud, B. “Methodology to Develop Crash Modification Functions for Road Safety Treatments with Fully Specified and Hierarchical Models.” Presented at the Transportation Research Board Annual Meeting, Washington, DC, January 2014.

Abstract: CMFs for road safety treatments are developed as multiplicative factors that are used to reflect the expected changes in safety performance associated with changes in highway design and/or traffic control features. However, current CMFs have methodological drawbacks. For example, variability with application circumstance is not well understood, and, as important, correlation is not addressed when several CMFs are applied multiplicatively. These issues can be addressed by developing SPFs with components of CMFunctions, an approach that includes all CMF related variables, along with others, while capturing quantitative and other effects of factors and accounting for cross-factor correlations. CMFunctions can capture the safety impact of factors through a continuous and quantitative approach, avoiding the problematic categorical analysis that is often used to capture CMF variability. There are two formulations to develop such SPFs with CM-Function components—fully specified models and hierarchical models. Based on sample datasets from two Canadian cities, both approaches are investigated in this paper. While both model formulations yielded promising results and reasonable CMFunctions, the hierarchical model was found to be more suitable in retaining homogeneity of first-level SPFs, while addressing CM-Functions in sublevel modeling.
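The distinction between a single-valued CMF and a continuous CMFunction can be sketched as follows. The SPF form, the exponential CMFunction, and every coefficient below are hypothetical placeholders chosen for illustration; they are not the models or estimates from the paper.

```python
import math


def spf_expected_crashes(aadt, segment_km, b0=-7.0, b1=0.8):
    """Hypothetical SPF: expected crashes per year = length * exp(b0) * AADT^b1."""
    return segment_km * math.exp(b0) * aadt ** b1


def cm_function(lane_width_m, base_width_m=3.6, beta=-0.2):
    """Hypothetical CMFunction: the factor varies continuously with lane width
    and equals 1.0 at the base condition."""
    return math.exp(beta * (lane_width_m - base_width_m))


def combined_cmf(cmfs):
    """Conventional practice: multiply individual CMFs together.

    As the abstract notes, this ignores any correlation between treatments.
    """
    return math.prod(cmfs)
```

For example, `spf_expected_crashes(12000, 2.0) * cm_function(3.0)` adjusts the base prediction for a narrower lane. The abstract's point is that estimating such functions inside the SPF, rather than multiplying separately derived factors, allows cross-factor correlation to be accounted for.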

El-Basyouny, K., Barua, S., Islam, M. and Li, R. “Assessing the Effect of Weather States on Crash Severity and Type using Fully Bayesian Multivariate Safety Models.” Presented at the Transportation Research Board Annual Meeting, Washington, DC, January 2014.

Abstract: Rather than investigate the isolated effects of individual weather elements on crash occurrence, this study investigates the aggregated effect of weather states, which are defined as a combination of various weather elements (i.e., temperature, snow, rain, and wind speed), on crash occurrence. The main argument is that a combination of weather elements might better represent a particular weather condition and subsequent safety outcome. Therefore, to explore the effect of various weather states on crash severity and type, this study defined 12 weather states, based on temperature, snow, rain and wind speed, and developed multivariate safety models using 11 years of daily weather and crash data for the entire City of Edmonton. The proposed models were estimated in a Full Bayesian context via a Markov Chain Monte Carlo simulation, while a posterior predictive approach was used to assess the models’ goodness of fit. Results suggested that Property-Damage-Only (PDO) crashes increased by 4.5–45 percent due to adverse weather states. It was also shown that PDO crashes were more affected by adverse weather states compared to severe (injury and fatal) crashes. With regard to crash type, adverse weather states were associated with an increased occurrence of 9–73.7 percent for all crash types, with the highest increase recorded for Ran-Off-Road (ROR) crashes. The duration of daylight hours was found to be significant and negatively related to all crash types and PDO crashes. In addition, sudden weather changes of major snow or rain were statistically significant and positively related to all crash types. Days-of-the-week (i.e., weekdays and weekend) and seasons-of-the-year (winter, spring, summer, and fall) were used as dummy variables and were statistically significant in relation to crash occurrence.

Principal Components Analysis

Papadimitriou, E. and Yannis, G. “Is Road Safety Management Linked to Road Safety Performance?” Accident Analysis and Prevention, Volume 59, October 2013, pp. 593–603.

Abstract: This research aims to explore the relationship between road safety management and road safety performance at country level. For that purpose, an appropriate theoretical framework is selected, namely the “SUNflower” pyramid, which describes road safety management systems in terms of a five-level hierarchy: (i) structure and culture, (ii) programmes and measures, (iii) “intermediate” outcomes—safety performance indicators (SPIs), (iv) final outcomes—fatalities and injuries, and (v) social costs. For each layer of the pyramid, a composite indicator is implemented, on the basis of data for 30 European countries. Especially as regards road safety management indicators, these are estimated on the basis of Categorical Principal Component Analysis upon the responses of a dedicated road safety management questionnaire, jointly created and dispatched by the ETSC/PIN group and the “DaCoTA” research project. Then, quasi-Poisson models and Beta regression models are developed for linking road safety management indicators and other indicators (i.e. background characteristics, SPIs) with road safety performance. In this context, different indicators of road safety performance are explored: mortality and fatality rates, percentage reduction in fatalities over a given period, a composite indicator of road safety final outcomes, and a composite indicator of “intermediate” outcomes (SPIs). The results of the analyses suggest that road safety management can be described on the basis of three composite indicators: “vision and strategy,” “budget, evaluation and reporting,” and “measurement of road user attitudes and behaviours.” Moreover, no direct statistical relationship could be established between road safety management indicators and final outcomes. However, a statistical relationship was found between road safety management and “intermediate” outcomes, which were in turn found to affect “final” outcomes, confirming the SUNflower approach on the consecutive effect of each layer.

Spatial Kernel Averaging

Hadayeghi, A., Shalaby, A. and Persaud, B. “Development of Planning Level Transportation Safety Tools using Geographically Weighted Poisson Regression.” Accident Analysis and Prevention, Volume 42, Issue 2, March 2010, pp. 676–688.

Abstract: A common technique used for the calibration of collision prediction models is the Generalized Linear Modeling (GLM) procedure with the assumption of Negative Binomial or Poisson error distribution. In this technique, fixed coefficients that represent the average relationship between the dependent variable and each explanatory variable are estimated. However, the stationary relationship assumed may hide some important spatial factors of the number of collisions at a particular traffic analysis zone. Consequently, the accuracy of such models for explaining the relationship between the dependent variable and the explanatory variables may be suspected since collision frequency is likely influenced by many spatially defined factors such as land use, demographic characteristics, and traffic volume patterns. The primary objective of this study is to investigate the spatial variations in the relationship between the number of zonal collisions and potential transportation planning predictors, using the Geographically Weighted Poisson Regression modeling technique. The secondary objective is to build on knowledge comparing the accuracy of Geographically Weighted Poisson Regression models to that of Generalized Linear Models. The results show that the Geographically Weighted Poisson Regression models are useful for capturing spatially dependent relationships and generally perform better than the conventional Generalized Linear Models.
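The idea of letting coefficients vary over space can be sketched as a Poisson regression refitted at each location, with nearby zones weighted more heavily than distant ones. The sketch below is a deliberately simplified illustration with one predictor, a Gaussian distance kernel, and plain Newton iterations; the function names, data layout, and kernel choice are assumptions, not the authors' implementation.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Spatial kernel: weight decays smoothly with distance from the target."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_poisson_fit(target, zones, bandwidth, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x at `target` by kernel-weighted Poisson likelihood.

    zones: list of (x_coord, y_coord, predictor, collisions) tuples.
    Returns the local coefficients (b0, b1). Assumes the weighted mean of
    collisions is positive and the predictor varies among nearby zones.
    """
    tx, ty = target
    w = [gaussian_weight(math.hypot(zx - tx, zy - ty), bandwidth)
         for zx, zy, _, _ in zones]
    xs = [z[2] for z in zones]
    ys = [z[3] for z in zones]

    # Start from the intercept-only solution (log of the weighted mean count).
    b0 = math.log(sum(wi * yi for wi, yi in zip(w, ys)) / sum(w))
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Weighted score vector.
        g0 = sum(wi * (yi - mi) for wi, yi, mi in zip(w, ys, mu))
        g1 = sum(wi * (yi - mi) * xi for wi, yi, mi, xi in zip(w, ys, mu, xs))
        # Weighted information matrix (2 x 2, symmetric).
        h00 = sum(wi * mi for wi, mi in zip(w, mu))
        h01 = sum(wi * mi * xi for wi, mi, xi in zip(w, mu, xs))
        h11 = sum(wi * mi * xi * xi for wi, mi, xi in zip(w, mu, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1
```

Calling `local_poisson_fit` once per traffic analysis zone centroid produces a surface of coefficients. If the relationship is truly stationary, every location returns nearly the same pair, which is the situation a conventional GLM assumes. In practice the bandwidth is usually selected by cross-validation or an information criterion.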

Cox Proportional Hazards Model

Jovanis, P. and Chang, H-L. “Disaggregate Model of Highway Accident Occurrence Using Survival Theory.” Accident Analysis and Prevention, Volume 21, Issue 5, October 1989, pp. 445–458.

Abstract: The analysis of discrete accident data and aggregate exposure data frequently necessitates compromises that can obscure the relationship between accident occurrence and potential causal risk components. One way to overcome these difficulties is to develop a model of accident occurrence that includes accident and exposure data at a mathematically consistent disaggregate level. This paper describes the conceptual and mathematical development of such a model using principles of survival theory. The model predicts the probability of being involved in an accident at time t given that a vehicle has survived until that time. Several alternative functional forms are discussed including additive, proportional hazards and accelerated failure time models. Model estimation is discussed for the case in which both accident and non-accident trips are included and for the case with only accident data. As formulated, the model has the distinct advantage of being able to consider accident and exposure data at a disaggregate level in an entirely consistent analytic framework. A conditional accident analysis is undertaken using truck accident data obtained from a major national carrier in the United States. Model results are interpretable and generally reasonable. Of particular interest is that segmenting accidents in several categories yields very different sets of significant parameters. Driver service hours seemed to most strongly affect accident risk: regularly scheduled drivers who take frequent trips are likely to have a reduced risk of an accident, particularly if they have a longer (greater than eight) number of hours off-duty just prior to a trip.
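The proportional hazards form mentioned in the abstract scales a baseline hazard by an exponential function of the covariates, h(t | x) = h0(t) * exp(b . x). The sketch below shows that form and the resulting hazard ratio. The survival function additionally assumes a constant baseline hazard, which is a simplification made here for illustration (a Cox model leaves the baseline unspecified), and all names and values are hypothetical.

```python
import math


def linear_predictor(covariates, coefficients):
    """Sum of coefficient * covariate over all covariates."""
    return sum(b * x for b, x in zip(coefficients, covariates))


def hazard(t, covariates, coefficients, baseline_hazard):
    """Proportional hazards: baseline hazard at time t scaled by exp(b . x)."""
    return baseline_hazard(t) * math.exp(linear_predictor(covariates, coefficients))


def hazard_ratio(cov_a, cov_b, coefficients):
    """Relative risk of profile a versus profile b; the baseline cancels out."""
    diffs = [xa - xb for xa, xb in zip(cov_a, cov_b)]
    return math.exp(linear_predictor(diffs, coefficients))


def survival_probability(t, covariates, coefficients, constant_baseline):
    """P(no accident up to time t), ASSUMING a constant baseline hazard."""
    rate = constant_baseline * math.exp(linear_predictor(covariates, coefficients))
    return math.exp(-rate * t)
```

With a hypothetical coefficient of -0.3 on an indicator for more than eight hours off duty, `hazard_ratio([1.0], [0.0], [-0.3])` gives the accident hazard of a rested driver relative to an otherwise identical driver at any point in the trip.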



Federal Highway Administration | 1200 New Jersey Avenue, SE | Washington, DC 20590 | 202-366-4000
Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101