U.S. Department of Transportation
Federal Highway Administration
Federal Highway Administration Research and Technology
This report is an archived publication and may contain dated technical, contact, and link information 

Publication Number: FHWA-RD-98-133
Date: October 1998 

Accident Models for Two-Lane Rural Roads: Segment and Intersections

5. Modeling

Intersection Models

Models for the three-legged and four-legged intersections in Minnesota and Washington are of Poisson and negative binomial type. Extended negative binomial models, appropriate for nonhomogeneous and variable stretches of road, are not attempted. The variables used to model accidents describe traffic volumes, horizontal and vertical alignment, channelization, roadside (driveways and hazard rating), intersection angle, and posted speed. Although sight distance is a desirable variable, data were not available. The alignment variables and hazard rating can be viewed as partial surrogates for sight distance.

Because the intersection models are based on fewer observations than the segment models, and the relationships revealed between accidents and intersection variables are less clear-cut, some adjustments are made in the criteria for retaining variables in the models. In order to identify design variables that influence accidents and are subject to the control of designers, in many of the models P-values are allowed much greater range than in the segment models. Values as high as 30% occur in some models.

To some extent this represents a shift in methodology. For a P-value of 5%, under the null hypothesis that a particular variable has no influence and thus has zero as its true coefficient, there is one chance in 20 that the estimate for the coefficient will be as far away from zero as, or farther away than, it is found to be. With a P-value of 30%, under the null hypothesis there are three chances in 10 that the estimate will be as far from zero as, or farther than, the actual estimate. On this view the estimated coefficient is regarded as a fluctuation from zero due to random errors in the data. However, there is no compelling reason why the null hypothesis should govern the analysis, especially when engineering judgment suggests that the variable under study has an influence on accident counts.
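As a rough illustration of the P-value interpretation above, the following Python sketch relates a two-sided P-value to the ratio of a coefficient estimate to its standard error. It assumes a normal (Wald) approximation, which may differ from the test statistic actually used for the report's tables. The function names and sample P-values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def two_sided_p_value(estimate: float, std_error: float) -> float:
    """Two-sided P-value for H0: coefficient = 0 (assumed normal approximation)."""
    z = abs(estimate / std_error)
    return 2.0 * (1.0 - NormalDist().cdf(z))


def z_threshold(p_value: float) -> float:
    """Size of |estimate / std_error| that yields the given two-sided P-value."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


if __name__ == "__main__":
    # Compare the conventional 5% criterion with the relaxed 30% criterion.
    for p in (0.05, 0.30):
        print(f"P = {p:.2f} corresponds to |z| = {z_threshold(p):.2f}")
```

Under this approximation, a P-value of 5% corresponds to an estimate roughly two standard errors from zero, while a P-value of 30% corresponds to an estimate roughly one standard error from zero.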
A defensible alternative is to view the estimated coefficient arrived at by maximum likelihood methods as a "best guess" whose confidence interval is measured by the standard error of the estimate. Larger P-values correspond to larger confidence intervals, perhaps intervals that include zero, but the estimate itself summarizes the data better than assignment of a zero coefficient and removal of the variable from the model. Adopting the "best guess" viewpoint is a more aggressive, less conservative stance toward the investigation of the underlying reality. Permitting larger P-values may be thought of as a partial transition toward the latter stance: we still show some deference toward the null hypothesis, but we attend closely to the estimate offered by the model, more closely the smaller its standard error.

Tables 29 through 35 below exhibit the chief models of both Poisson and negative binomial type for both the three-legged and four-legged intersections. For comparability, number of years is used as an offset so that what is modeled is mean number of accidents per year. Estimated coefficients for each variable are shown, along with their standard errors and P-values. Some variables considered in the preliminary analysis may not appear in the tables: variants of the variables used here, as well as the weather variables SNP and NONDRYP in Minnesota (these had negative sign and were not very significant). Tables 36 and 37 exhibit models for injury accidents.

Traffic

The chief variables are major and minor road traffic, ADT1 and ADT2. In addition, the variable CINDEX, a conflict index measuring the relationship between these two, was considered. In preliminary runs it was not significant when used in addition to them, and it was less significant than either of them when used as a substitute for one of them. ADT1 and ADT2 have different relative effects in the three-legged versus the four-legged cases (cf. Table 35):
For four-legged intersections, major and minor road ADT have approximately equal influence, while for three-leggeds the major road ADT dominates. If one views a four-legged intersection as two three-legged intersections, admittedly an oversimplification, and accordingly halves the coefficient of LADT2 in the last column above, the effects are seen to be roughly compatible.
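To make the role of the offset and of the log-traffic coefficients concrete, the following Python sketch evaluates a log-linear mean of the kind used in Poisson and negative binomial regression. It assumes a log link, assumes that LADT1 and LADT2 are the natural logarithms of ADT1 and ADT2, and leaves out all other covariates. The coefficient values are hypothetical placeholders and are not taken from Table 35.

```python
import math


def expected_accidents(years: float, adt1: float, adt2: float,
                       b0: float, b_ladt1: float, b_ladt2: float) -> float:
    """Expected accident count over `years`, with years entering as an offset.

    Assumed form: mu = years * exp(b0 + b_ladt1*ln(ADT1) + b_ladt2*ln(ADT2)).
    """
    linear_predictor = b0 + b_ladt1 * math.log(adt1) + b_ladt2 * math.log(adt2)
    return years * math.exp(linear_predictor)


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    coeffs = dict(b0=-10.0, b_ladt1=0.8, b_ladt2=0.5)
    per_year = expected_accidents(1, adt1=4000, adt2=400, **coeffs)
    doubled_major = expected_accidents(1, adt1=8000, adt2=400, **coeffs)
    print(f"accidents per year: {per_year:.3f}")
    # Doubling ADT1 multiplies the mean by 2 ** b_ladt1.
    print(f"ratio after doubling ADT1: {doubled_major / per_year:.3f}")
```

Because years enters as an offset (a multiplier with its coefficient fixed at 1), the fitted coefficients describe accidents per year regardless of how many years of data a site contributes. In this form each log-traffic coefficient acts as an elasticity, which is why the relative sizes of the LADT1 and LADT2 coefficients indicate which road's traffic dominates.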
Table 29. Poisson Models, 3-Legged Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 30. Poisson Models, 4-Legged Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 31. Negative Binomial Models, 3-Legged Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 32. Negative Binomial Models, 4-Legged Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 33. Additional Negative Binomial Models, Combined (MN/WA) Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 34. Additional Negative Binomial Models, Minnesota Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 35. Final Negative Binomial Models, Minnesota Intersection Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 36. Negative Binomial Models, 3-Legged Intersection Injury Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Table 37. Negative Binomial Models, 4-Legged Intersection Injury Accidents. Regression Coefficients (Standard error and P-value in parentheses)
Alignment, Channelization, and Speed

Two horizontal curve variables were used, HI and HEI, measuring degree of curvature out to 250 and 764 feet, respectively. These variables had unexpected sign and/or were insignificant in Washington State (for HI, see Tables 29, 30, 31, 32, 36, 37) but behaved somewhat better in Minnesota for both three-legged and four-legged intersections. HI was more stable than HEI, and so for comparability we elected to use HI as our horizontal variable in the runs shown.

Three vertical curve variables were considered: VCI, VI, and VEI. Each measures average grade change per hundred feet for vertical curves near the intersection. The first is for crests out to 250 feet, the second is for both crests and sags out to 250 feet, and the third is for both crests and sags out to 764 feet. In the Minnesota data (the larger of the two State data sets), VCI, the crest-only variable and the vertical alignment variable most closely related to sight distance, was substantially more significant than VI and VEI, and hence was selected for inclusion in the runs presented here. On the Washington data the vertical curve variables tended to have unexpected sign and/or be very insignificant.

Several measures of channelization were used in the modeling, but the measure that proved most significant was RT, which takes the value 1 or 0 according to whether there is or is not at least one right turn lane on the major road. Other channelization variables (for bypass lanes on three-leggeds; zero, one, or two right turn lanes on four-leggeds; or acceleration lanes for the minor roads) were not significant and/or did not show much variation. Thus RT represents channelization in all runs. On three-legged intersections its coefficient was consistently positive and significant.
It is not known whether RT is a surrogate for high accident intersections (i.e., because many accidents tend to occur at the intersection, a right turn lane has been added) or a surrogate for high right turn major road traffic (and high left turn minor road traffic). On the four-legged intersections, the coefficient of RT tended to be negative but was not particularly significant.

The speed variable SPDI, an average of approach speeds, although negatively correlated with ADT, the alignment variables, and number of driveways, seemed to make an independent contribution to the accident frequency in all models.

Roadside Variables: Number of Driveways and Hazard Rating

Perhaps the most remarkable feature of the intersection models is the unexpected but systematic behavior of the variables ND, number of driveways, and RHRI, Roadside Hazard Rating. The coefficient of RHRI is positive at three-legged intersections while that of ND is negative. The reverse occurs on four-leggeds: the coefficient of RHRI is negative and that of ND is positive. Because of the unexpected negative signs, ND has been omitted from some three-legged runs and RHRI has been omitted from some four-legged runs.

With respect to driveways, perhaps drivers take more care when driveways are to be found in the neighborhood of a three-legged intersection, but insufficient additional care in the neighborhood of a four-legged intersection. Each driveway or intersection leg represents potential traffic and requires a share of driver attention. In the intersection data sets driveways actually occur at a larger percentage of three-legged intersections (62.5% in MN and 63% in WA according to Tables 4 and 5) than four-leggeds (32.4% in MN and 46.7% in WA according to Tables 6 and 7). At four-legged intersections, it might be argued that driveways are a third unexpected complication in addition to the two minor road legs, less easily integrated than the two complications at a three-legged: a driveway and one minor leg.
With respect to hazard rating, an opposite and possibly inconsistent explanation might be offered: it may be that drivers underestimate roadside hazards at three-leggeds and, relatively speaking, overestimate them at four-leggeds. Roadside hazards such as obstacles and steep sideslope do not require the same kind of attention as potential traffic entry points. Perhaps such hazards are more likely to be properly attended to when both sides of the roadway have entry points and available accident avoidance tactics are more limited.

The Angle Variable

The variable HAU used in Tables 29 through 33 and 35 through 37 is a signed variable proposed by Ezra Hauer (see Figures 4 and 5). For a three-legged intersection HAU is positive when the angle is larger than 90° as in Figure 4(a), and HAU is negative when the angle is smaller than 90° as in Figure 4(b). On the basis of work of Kulmala (1995) it is thought that turns from the far lane of the major road may be less accident-prone in situation 4(a) than in situation 4(b). Accordingly the coefficient of HAU in the three-legged intersection model would be negative (it is proposed that accidents are less frequent when HAU is positive and more frequent when HAU is negative). Of course, there are other turns to be made: a turn from the near lane of the major road, and turns left and right from the minor road. The four-legged version of HAU is the average of the HAU variable for two three-legged intersections (one to the right, one to the left), and would likewise have a negative coefficient if accidents owing to far lane turns through large angles are predominant.

Tables 29 through 32 do not support any strong conclusion. Minnesota and Washington have opposite experience with the variable HAU. Minnesota angle data must be considered much more reliable, though, than Washington angle data. While Minnesota angles were determined from construction plans, those for Washington were very rough estimates made from photologs.
Visibility of the direction of minor roads was extremely limited in the photologs. As Tables 4 through 7 indicate, for Minnesota three-leggeds 50.6% were reported as right angles versus 95.6% in Washington; for four-leggeds 37.6% were reported as right angles in Minnesota versus 88.9% in Washington.

In the Minnesota Poisson models HAU is significant, but its coefficient has unexpected sign (positive) for the three-leggeds, although it behaves as expected for four-leggeds. Under the negative binomial models HAU is marginally significant for the Minnesota data, with the same coefficient signs as for the Poisson.

The two other angle variables considered in this study are DEV (the absolute deviation from 90° of the angle, or the average of the two absolute deviations for the four-leggeds) and DEV15 (the squared difference between DEV and 15°, divided by 100). The behavior of these three variables on the Minnesota data is summarized below.
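The definitions of DEV and DEV15 given above can be written out as a short Python sketch. The function names are hypothetical, and it is an assumption that a four-legged intersection is described by two angles in degrees, one for each minor leg. HAU is not coded here because its exact formula is not stated in this section.

```python
from typing import Sequence


def dev(angles_deg: Sequence[float]) -> float:
    """Absolute deviation from 90 degrees, averaged over the given angles.

    Pass one angle for a three-legged intersection, or two angles
    (assumed: one per minor leg) for a four-legged intersection.
    """
    return sum(abs(angle - 90.0) for angle in angles_deg) / len(angles_deg)


def dev15(dev_value: float) -> float:
    """Squared difference between DEV and 15 degrees, divided by 100."""
    return (dev_value - 15.0) ** 2 / 100.0


if __name__ == "__main__":
    three_leg = dev([75.0])          # single angle of 75 degrees
    four_leg = dev([60.0, 100.0])    # deviations of 30 and 10 degrees, averaged
    print(three_leg, dev15(three_leg))
    print(four_leg, dev15(four_leg))
```

By construction DEV15 is smallest (zero) when DEV equals 15°, that is, at angles of 75° or 105°, and it grows as the deviation moves away from 15° in either direction.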
Thus angle, however measured, is a significant variable at four-legged intersections, and HAU is significant (but the others are not) at three-leggeds. DEV15 is an empirical variable developed in connection with study of the four-legged intersections. On some runs of Minnesota four-legged data it was more significant than DEV, suggesting that accident rates are highest at angles of 75° and 105°. It was also more significant than DEV on the combined Minnesota and Washington four-legged data. For reasons of simplicity we omit DEV15 from our tables, although we did use DEV on some four-legged runs (Tables 33 and 34).

Negative Binomial Models: Minnesota versus Washington

The statistics compiled in the lower rows of Tables 29 and 30 indicate that the Poisson models have definite explanatory power, especially the Minnesota models, but that they are nonetheless overdispersed. The values of T_1 should be approximately normally distributed about zero if the overdispersion parameter is zero, but the values instead tend to be large positive numbers. The scaled deviance and the scaled Pearson chi-square likewise have values indicative of overdispersion. Accordingly we pass to negative binomial models in Tables 31 through 37.

Tables 31 and 32 are negative binomial counterparts of Tables 29 and 30, with the same variables. In general the Poisson and negative binomial models are consistent with one another: coefficients have the same sign and similar magnitudes. In most cases the P-value of coefficients increases, the individual variables are thus less significant, and the overdispersion parameter K, a stand-in for omitted variables, makes a significant contribution to all of the negative binomial models. In Washington the overdispersion parameters are larger than in Minnesota, and fewer variables are significant.
In particular, for the Washington three-legged models the marginally significant variables VCI and RHRI become insignificant as one passes from the Poisson model in Table 29 to the negative binomial model in Table 31. For the Washington four-legged models the variables ADT1, HI, SPDI, RHRI, and RT become less significant from Table 30 to Table 32, with ADT1 and RHRI becoming insignificant. Because it is well accepted that ADT1 is an important variable, the quality of the data is called into question. The standard error for ADT1 is consistent with both a zero value and a much larger value (comparable to that of Minnesota).

For all intersections in this study, the traffic data are imperfect. In rural sites they typically are based on spot measurements (part of a day at a site along the road near the intersection). Although efforts are made to average the data, with daily, weekly, seasonal, and annual variation taken into account, and with attempts to localize the count to the vicinity of the intersection, the results are not very reliable. Examination of files for both Minnesota and Washington shows that reported ADT for rural intersections is often the same from year to year (with no evidence that new measurements have been made or that paper estimates have been revised). When traffic data are available for all legs, sometimes they do not make sense: the difference in ADT between the two legs of the major road has no obvious relation to the minor road ADT.

Efforts were made in this study to correct imperfections in the Minnesota intersection ADT, but because the Washington data were not part of an established data base, no similar efforts could be made with them. The Minnesota models are thus more trustworthy. Nonetheless, models for both sets of data, and for combined data, are included for comparison purposes.
Where there is disagreement between Minnesota and Washington, the relevant variable should receive extra scrutiny and the evidence of Minnesota should be considered less conclusive than otherwise.

Additional Negative Binomial Runs

In Tables 33, 34, and 35 we exhibit additional negative binomial models for Minnesota and combined data. Table 33 shows combined data for both States with variables that are significant or reasonably close to significant in the "best guess" spirit. For the three-leggeds, compare Table 33 with the last column in Table 31: VCI and ND have been omitted. Both are very insignificant and ND has unexpected sign (more driveways lead to fewer accidents). For the four-leggeds, compare Table 33 with the last column of Table 32: HI is very insignificant and has been omitted; RHRI, although significant, has unexpected sign (the more hazardous the roadside, the fewer the accidents) and has also been omitted. The State variable is not significant in any of these runs, but has been retained nonetheless.

Table 34 shows Minnesota negative binomial runs where all but the most significant variables have been omitted. The results for the Minnesota three-leggeds are quite consistent with the Minnesota column of Table 31. For the four-leggeds, either horizontal or vertical alignment can serve as a significant explanatory variable, but not both. Angular deviation DEV from 90° is also strongly significant: the greater the deviation, the fewer the predicted accidents. Note that SPDI is not among the variables retained in Table 34; nor is HAU (but angle is represented by DEV).

Negative Binomial Models for Injury Accidents

We also exhibit negative binomial models for injury accidents (INJACC) in Tables 36 and 37. These tables are comparable to Tables 31 and 32 and show that the same coefficient magnitudes generally are to be found, although with reduced significance.
With respect to the three-legged INJACC runs, the most significant variables besides ADT are Roadside Hazard Rating RHRI and channelization RT (in Minnesota and the combined data). This is similar to Table 31 where all accidents (TOTACC) are modeled. With respect to the four-legged INJACC runs, RHRI is again significant but with unexpected sign, and this mirrors the behavior in Table 32 and elsewhere.

Final Intersection Models

The chief idiosyncrasies found in the various models are already present in the Poisson runs (Tables 29 and 30). We list some of these:
In view of the small size of the Washington State sample (the combined models are generally dominated by the Minnesota data), the nonrandom and ad hoc character of the Washington intersections (an "opportunity" sample), the lesser quality of some of the collected Washington data (e.g., traffic and angle), and the insignificance of variables of interest (including the State variable), we take the Minnesota models as fundamental.

In particular, we offer the models in Table 35 as our final models for three-legged and four-legged intersections. These models are based exclusively on Minnesota data, and significant variables and marginally significant ones are included where we have allowed greater latitude for the alignment variables in the spirit of a "best guess" approach. In these runs the variables with unexpected signs (ND for the three-leggeds and RHRI for the four-leggeds) have been omitted. These models are the best we have to offer. Their shortcomings become apparent by comparing them with Tables 31 and 32, where more variables are included and both States are represented.
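For readers comparing the Poisson and negative binomial tables, the following Python sketch illustrates two ideas from the discussion of overdispersion. It computes a Pearson chi-square per degree of freedom, which should be near 1 for an adequate Poisson fit, and it shows how an overdispersion parameter K inflates the variance. The variance form mu + K*mu^2 is an assumed, commonly used parameterization and may not match the report's exact definition. The function names and sample counts are hypothetical.

```python
from typing import Sequence


def pearson_dispersion(observed: Sequence[float], fitted: Sequence[float],
                       n_params: int) -> float:
    """Pearson chi-square divided by residual degrees of freedom.

    Under a Poisson model the variance equals the mean, so each squared
    residual is scaled by the fitted mean.
    """
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / (len(observed) - n_params)


def negative_binomial_variance(mu: float, k: float) -> float:
    """Assumed negative binomial variance: mu + K * mu**2 (K = 0 gives Poisson)."""
    return mu + k * mu ** 2


if __name__ == "__main__":
    # Hypothetical accident counts and fitted means for six intersections.
    counts = [0, 1, 0, 7, 2, 9]
    means = [1.0, 1.5, 0.8, 3.0, 2.0, 4.0]
    print(f"Pearson chi-square / df: {pearson_dispersion(counts, means, 2):.2f}")
    print(f"Variance at mu = 3 with K = 0.5: {negative_binomial_variance(3.0, 0.5):.2f}")
```

A ratio well above 1 suggests that the counts vary more than a Poisson model allows, which is the situation that motivates passing to a negative binomial model.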
