U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000
Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations
REPORT 
This report is an archived publication and may contain dated technical, contact, and link information 

Publication Number: FHWA-HRT-14-081 Date: November 2014
The aim of this section is not to go in depth into the most commonly applied methodologies but rather to illustrate how they are applied in road safety research. Selected references are also provided where further information may be found.
The EB methodology for observational before-after studies is considered rigorous in that it accounts for regression-to-the-mean (RTM). In the process, SPFs are used and the use of these addresses the following:
In the EB approach, the change in safety for a given crash type at a site is given by the following:
$$\Delta\,\text{Safety} = \lambda - \pi$$
Figure 7. Equation. Change in safety using the EB approach.
Where:
λ = expected number of crashes that would have occurred in the after period without the strategy.
π = number of reported crashes in the after period.
In estimating λ, the effects of RTM and changes in traffic volume are explicitly accounted for using SPFs, relating crashes of different types to traffic flow and other relevant factors for each jurisdiction based on untreated sites (reference sites). Annual SPF multipliers are calibrated to account for temporal effects on safety (e.g., variation in weather, demography, and crash reporting).
In the EB procedure, the SPF is used to first estimate the number of crashes that would be expected in each year of the before period at locations with traffic volumes and other characteristics similar to the one being analyzed (i.e., reference sites). The sum of these annual SPF estimates (P) is then combined with the count of crashes (x) in the before period at a strategy site to obtain an estimate of the expected number of crashes (m) before strategy. This estimate of m is computed as follows:
$$m = wP + (1 - w)\,x$$
Figure 8. Equation. Expected number of crashes before strategy implementation using the EB approach.
Where w is estimated from the mean and variance of the SPF estimate as follows:
$$w = \frac{1}{1 + kP}$$
Figure 9. Equation. Estimated weight using the EB approach.
Where k = constant for a given model and is estimated from the SPF calibration process with the use of a maximum likelihood procedure. In that process, a negative binomial distributed error structure is assumed with k being the overdispersion parameter of this distribution.
A factor is then applied to m to account for the length of the after period and differences in traffic volumes between the before and after periods. This factor is the sum of the annual SPF predictions for the after period divided by P, the sum of these predictions for the before period. The result, after applying this factor, is an estimate of λ. The procedure also produces an estimate of the variance of λ.
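The per-site steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation: it assumes the commonly used EB forms w = 1/(1 + kP) and m = wP + (1 - w)x, and it assumes the variance of λ is the squared adjustment factor times (1 - w)m. The function name and all input values are hypothetical.

```python
def eb_expected_after(spf_before, spf_after, x_before, k):
    """Return (m, lam, var_lam) for a single strategy site.

    spf_before / spf_after: annual SPF predictions for each year of the
    before and after periods; x_before: observed before-period crashes;
    k: negative binomial overdispersion parameter from SPF calibration.
    """
    P = sum(spf_before)                # sum of annual SPF estimates, before period
    w = 1.0 / (1.0 + k * P)            # EB weight (assumed form)
    m = w * P + (1.0 - w) * x_before   # expected crashes before the strategy
    r = sum(spf_after) / P             # adjusts for after-period length and traffic
    lam = r * m                        # expected after-period crashes without the strategy
    var_lam = r ** 2 * (1.0 - w) * m   # assumed variance of lam
    return m, lam, var_lam


# Hypothetical site: 3 before years, 2 after years, 12 observed before crashes.
m, lam, var_lam = eb_expected_after([2.0, 2.0, 2.0], [2.2, 2.2], x_before=12, k=0.5)
print(m, lam, var_lam)
```

With these made-up inputs the weight is 0.25, so the estimate of m sits three-quarters of the way from the SPF prediction (6) toward the observed count (12).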
The estimate of λ is then summed over all sites in a strategy group of interest (to obtain λ_sum) and compared with the count of crashes observed during the after period in that group (π_sum). The variance of λ is also summed over all sites in the strategy group.
The CMF (θ) is estimated as follows:
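$$\theta = \frac{\pi_{sum} / \lambda_{sum}}{1 + \dfrac{\mathrm{Var}(\lambda_{sum})}{\lambda_{sum}^{2}}}$$
Figure 10. Equation. CMF estimate using the EB approach.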
The standard deviation of θ is given by the following:
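$$\mathrm{StDev}(\theta) = \sqrt{\frac{\theta^{2}\left[\dfrac{\mathrm{Var}(\pi_{sum})}{\pi_{sum}^{2}} + \dfrac{\mathrm{Var}(\lambda_{sum})}{\lambda_{sum}^{2}}\right]}{\left[1 + \dfrac{\mathrm{Var}(\lambda_{sum})}{\lambda_{sum}^{2}}\right]^{2}}}$$
Figure 11. Equation. Standard deviation of the CMF estimate using the EB approach.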
The percent change in crashes is calculated as 100(1 - θ); thus a value of θ = 0.7 with a standard deviation of 0.12 indicates a 30 percent reduction in crashes with a standard deviation of 12 percent.
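The group-level calculation can be sketched as follows. This is a hedged illustration rather than the report's own code: it assumes the widely used EB expressions in which the ratio of observed to expected crashes is divided by 1 + Var(λ_sum)/λ_sum², and it assumes the variance of the observed after-period count equals the count itself (a Poisson assumption) unless another value is supplied. The function name and inputs are hypothetical.

```python
import math


def eb_cmf(pi_sum, lam_sum, var_lam_sum, var_pi_sum=None):
    """Return (theta, sd) for a strategy group.

    pi_sum: observed after-period crashes summed over sites;
    lam_sum: EB-expected after-period crashes summed over sites;
    var_lam_sum: summed variance of the EB estimates.
    """
    if var_pi_sum is None:
        var_pi_sum = pi_sum            # assumption: Poisson-distributed counts
    adj = 1.0 + var_lam_sum / lam_sum ** 2
    theta = (pi_sum / lam_sum) / adj   # CMF estimate (assumed form)
    sd = math.sqrt(
        theta ** 2
        * (var_pi_sum / pi_sum ** 2 + var_lam_sum / lam_sum ** 2)
        / adj ** 2
    )
    return theta, sd


# Hypothetical group totals: 70 observed crashes against 100 expected.
theta, sd = eb_cmf(pi_sum=70, lam_sum=100.0, var_lam_sum=25.0)
percent_change = 100.0 * (1.0 - theta)   # percent reduction in crashes
print(theta, sd, percent_change)
```

With these made-up totals, θ comes out slightly below the naive ratio of 0.7 because of the variance correction in the denominator.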
Cross-sectional studies are particularly useful for estimating CMFs where the treatment has been applied at too few sites to conduct a before-after study. For example, there may be few or no projects where the shoulder is widened from 4 to 6 ft. However, there would be many road segments with 4-ft shoulders and many with 6-ft shoulders whose crash experience can be compared. Before-after studies are impractical in such cases because there are often not enough before-after situations to allow for credible results.
In practice, it is difficult to collect data for enough locations that are alike in all factors affecting crash risk. Hence, cross-sectional analyses are often accomplished through multiple variable regression models. In these models, an attempt is made to account for all variables that affect safety. If such attempts are successful, the models can be used to estimate the change in crashes that results from a unit change in a specific variable. The CMF is derived from the model parameters. The regression approach for estimating a CMF is consistent with the belief that the CMF is a function of the traits of the treated unit. A cross-sectional approach can be used to develop a CMFunction and is preferable if the cause-effect relationship with crashes can be determined with confidence.
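As a rough illustration of how a CMF can be derived from model parameters, the sketch below assumes a log-linear crash prediction model (the usual form for negative binomial regression), in which a variable x enters as exp(βx). Under that assumption, changing x by Δ multiplies predicted crashes by exp(βΔ). The coefficient value and function name are hypothetical and are not taken from any fitted model.

```python
import math


def cmf_from_coefficient(beta, delta):
    """CMF implied by a log-linear model for a change of `delta` in a variable.

    Assumes predicted crashes are proportional to exp(beta * x), so the ratio
    of predictions after and before the change is exp(beta * delta).
    """
    return math.exp(beta * delta)


# Hypothetical shoulder-width coefficient of -0.05 per ft; widening from 4 to 6 ft.
cmf = cmf_from_coefficient(beta=-0.05, delta=6.0 - 4.0)
print(round(cmf, 3))
```

A CMF obtained this way is only as reliable as the model behind it, which is why the issues of functional form and omitted variables matter.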
CMFs estimated from cross-sectional studies could be inaccurate for a number of reasons, including inappropriate functional form, omitted variable bias, or correlation of variables. It is common practice to use generalized linear modeling techniques, assuming a negative binomial error structure, to estimate multivariate crash prediction models. However, it is difficult to account for all factors that affect safety using such modeling techniques. For example, intersections with left-turn lanes also tend to have illumination. If a crash prediction model is used to estimate a CMF for left-turn lanes, and the presence of illumination is not accounted for in the model, the difference in model predictions with and without left-turn lanes could be partly due to illumination differences. Ironically, it is precisely because a variable is found to be correlated with another variable that it may be omitted during the model fitting exercise. Including correlated variables could in fact lead to effects that are counterintuitive (e.g., illumination increases nighttime crashes).
Case-control methods have been used in certain areas of highway safety, but few studies have focused on the effects of geometric design elements. For example, case-control studies have been applied to investigate the effectiveness of motorcycle helmet use and the crash risk associated with hours of service for truck drivers. More recently, the case-control method was employed to estimate CMFs for geometric design elements, including lane and shoulder width. Case-control studies assess whether exposure to a potential risk factor is disproportionately distributed between the cases and controls, thereby indicating the likelihood of an actual risk factor.
The likelihood of an actual risk factor is expressed as the odds ratio between two levels of a variable. For example, it may be found that the odds of a crash occurring on horizontal curves with a degree of curvature greater than 15 degrees are 1.5 times the odds of a crash occurring on curves of less than 15 degrees. The odds ratio is a direct estimate of the CMF. Risk factors may take the form of binary variables (e.g., median barrier, roadway lighting, or guiderail) or multilevel variables such as lane width (e.g., 9-, 10-, 11-, and 12-ft lanes). The sample is summarized by risk factor and case-control status to calculate the odds ratio. To illustrate the concept of the odds ratio, consider the data in table 2.
Risk Factor | Number of Cases | Number of Controls
With        | A               | B
Without     | C               | D
The odds ratio (CMF) is expressed as the expected increase or decrease in the outcome in question due to the presence of the risk factor. An odds ratio greater than 1.0 suggests that the presence of the risk factor increases risk, while a value less than 1.0 suggests a decrease in risk. Using the notation in the table, the odds ratio (OR) is calculated as:
$$OR = \frac{A/B}{C/D} = \frac{AD}{BC}$$
Figure 12. Equation. Odds ratio calculation.
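A short sketch of the calculation, using the A, B, C, D notation of the table, is given below. It assumes the standard two-by-two odds ratio, (A/B)/(C/D) = AD/(BC); the counts in the usage example are invented purely for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a two-by-two case-control table.

    a: cases with the risk factor,    b: controls with the risk factor,
    c: cases without the risk factor, d: controls without the risk factor.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when B or C is zero")
    return (a * d) / (b * c)


# Hypothetical sample: 30 cases and 70 controls with the factor,
# 20 cases and 80 controls without it.
or_estimate = odds_ratio(30, 70, 20, 80)
print(round(or_estimate, 2))
```

A result above 1.0, as with these made-up counts, would suggest that the presence of the risk factor is associated with increased risk.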
Case-control studies cannot be used to measure the probability of an event (e.g., crash, severe injury, etc.) in terms of expected frequency. They are more often used to show the relative effects of risk factors. Statistical analyses, such as multiple logistic regression techniques, are commonly used to clarify these relationships because they are able to examine the risk associated with one factor while controlling for other factors.
Finally, the case-control method cannot demonstrate causality because there is no time sequence of events in the analysis. Instead, the odds ratio indicates the increased likelihood of a crash occurring when a risk factor (e.g., a roadway characteristic) is present. It does not, however, distinguish between locations with many crashes and locations with a single crash. This is a loss of potentially important information, and thus the true increase in risk could be underestimated.