U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000
Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations
This report is an archived publication and may contain dated technical, contact, and link information 

Publication Number: FHWA-RD-03-037
Date: May 2005 

Validation of Accident Models for Intersections
FHWA Contact: John Doremi
2. VALIDATION OF ACCIDENT MODELS (Continuation)

2.5.3 Model III

The summary statistics for the original report and the Georgia data are given in table 62, which reveals that the Georgia sample had, on average, fewer accidents per year than the original data. This implies either that the Georgia sites were safer than the sites selected for the original model, or that the passage of time between the original calibration period (1993-95) and the validation period (1996-97) had brought an overall improvement in safety (due to many factors, including improved roadway design, improved vehicles, and emergency response services). The Georgia sites may also be safer because they have, on average, wider medians on major roads and fewer driveways than the original intersections. In addition, more than 50 percent of the sites in the original data had no median, while only 5 percent of the Georgia sites were without a median.

Table 62. Summary Statistics of Georgia Data: Type III Sites
^{1} N/A: not available

Total Accident Models (TOTACC)

The model was recalibrated using the Georgia data. The parameter estimates, their standard errors, and p-values are shown in table 63, which reveals that the constant term, AADT1, and AADT2 were estimated with the same sign as in the original model but with large differences in magnitude. MEDWDTH2 and DRWY1 were estimated with the opposite sign, although they were not statistically significant. AADT1 was also statistically insignificant. The overdispersion parameters were lower than that of the original model, but the difference was not great.

Table 63. Parameter Estimates for TOTACC Type III Model Using Georgia Data
^{1} Vogt, 1999, (p. 111)
^{2} K: Overdispersion value

Table 64 shows the prediction performance statistics for Model III for TOTACC. Low Pearson product-moment correlation coefficients indicate that the accident predictions of the original model are only marginally correlated with the observed number of accidents in the Georgia data. Other validation statistics also suggest a poor fit of the original model to the Georgia data. The MPB and MAD per year were larger than those for the original model. The MSPE per year squared was almost twice as high as the MSE per year squared. Figure 5 depicts the prediction performance of the original model for individual sites in the Georgia 0.05-mile data. The original model clearly does not predict accidents well at the Georgia intersections; this finding was expected on the basis of the low Pearson product-moment correlation coefficients for the Georgia data.

Table 64. Validation Statistics for TOTACC Type III Model Using Georgia Data
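The validation measures named in this discussion (Pearson product-moment correlation, MPB, MAD, and MSPE) can be illustrated with a minimal Python sketch. The formulas below are the commonly used definitions and are assumptions here, since this section does not state them: MPB is taken as the mean of predicted minus observed, MAD as the mean absolute difference, and MSPE as the mean squared difference. The function name, the `years` argument used for the per-year scaling, and the sample counts are all hypothetical.

```python
from math import sqrt


def validation_stats(observed, predicted, years=1.0):
    """Return Pearson r, MPB, MAD, and MSPE for paired site-level counts.

    Assumed definitions (not taken from this section of the report):
      MPB  = mean(predicted - observed)
      MAD  = mean(|predicted - observed|)
      MSPE = mean((predicted - observed) ** 2)
    Counts are totals over `years`; MPB and MAD are divided by `years`
    and MSPE by `years` squared to give per-year values.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 sites")

    errors = [p - o for o, p in zip(observed, predicted)]
    mpb = sum(errors) / n
    mad = sum(abs(e) for e in errors) / n
    mspe = sum(e * e for e in errors) / n

    # Pearson product-moment correlation between observed and predicted.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / sqrt(ss_o * ss_p) if ss_o > 0 and ss_p > 0 else float("nan")

    return {
        "pearson_r": r,
        "mpb_per_year": mpb / years,
        "mad_per_year": mad / years,
        "mspe_per_year_sq": mspe / years ** 2,
    }


# Hypothetical two-year counts at four sites (not data from the report).
stats = validation_stats([2, 0, 5, 3], [1.0, 1.0, 4.0, 2.0], years=2.0)
print(stats)
```

Under these definitions, a negative MPB means the model underpredicts on average, and a low correlation together with a large MAD corresponds to the kind of poor fit the text describes.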
Figure 5. Observed versus Predicted Accident Frequency: TOTACC Type III

Intersection Related Total Accident Model (TOTACCI)

The parameter estimates, their standard errors, and p-values are given in table 65. As in the TOTACC model, the variables AADT1, MEDWDTH1, and DRWY1 were statistically insignificant. The constant term, AADT1, AADT2, and DRWY1 were estimated with the same direction of effect as in the original model but with large differences in magnitude. The overdispersion parameters, K, were lower than that of the original model.

Table 65. Parameter Estimates for TOTACCI Type III Model Using Georgia Data
^{1} Vogt, 1999, (p. 112)
^{2} K: Overdispersion value

Table 66 shows the GOF statistics for Model III for TOTACCI. Low Pearson product-moment correlation coefficients indicate that the accident predictions of the original model are only marginally correlated with the observed number of accidents in the Georgia data. Other validation statistics also suggest a lack of fit to the Georgia data. The MPB and MAD per year were larger than those for the original model. The MSPE per year squared was almost twice as high as the MSE per year squared, indicating a general lack of fit to the Georgia data. A plot of predicted versus observed accidents for the Georgia data illustrates the prediction performance of the original model. As shown in figure 6, the original model clearly does not predict accidents well at the Georgia intersections; this finding was expected on the basis of the low Pearson product-moment correlation coefficients for the Georgia data.

Table 66. Validation Statistics for TOTACCI Type III Model Using Georgia Data
^{1} N/A: not available

Figure 6. Observed versus Predicted Accident Frequency: TOTACCI Type III

Injury Accident Model (INJACC)

The two original variants of Model III were validated.

Variant 1

The parameter estimates, their standard errors, and p-values are given in table 67, which reveals that the constant term and all of the variables were estimated with the same sign as in the original model, but with large differences in magnitude. The constant term, AADT1, and HAU became insignificant with the Georgia data. The overdispersion parameters, K, were higher than that of the original model. Table 68 shows the GOF measures for the original injury accident model (Variant 1) in the Vogt report applied to the Georgia data.^{(2)} The Pearson product-moment correlation coefficients were similar to those for the TOTACC model. However, the MPB, MAD, and MSPE per year squared were smaller than those for the TOTACC model.

Table 67. Parameter Estimates for INJACC Type III Model Using Georgia Data: Variant 1
^{1} Vogt, 1999, (p. 113)
^{2} K: Overdispersion value

Table 68. Validation Statistics for INJACC Type III Model Using Georgia Data: Variant 1
Figure 7 depicts the prediction performance of the original model for individual sites in the Georgia 0.05-mile data. The original model clearly does not predict accidents well at the Georgia intersections, a finding that would have been expected on the basis of the low Pearson product-moment correlation coefficients for the Georgia data.

Figure 7. Observed versus Predicted Accident Frequency: Injury Variant 1

Variant 2

The parameter estimates, their standard errors, and p-values are given in table 69, which reveals that the constant term and all of the variables were estimated with the same sign as in the original model. However, all of the variables except AADT2 became insignificant, and there were large differences in the magnitudes of the parameters. The overdispersion parameter, K, was almost twice as high as that of the original model. Table 70 shows the GOF measures for the original injury accident model (Variant 2) in the Vogt report applied to the Georgia data.^{(2)} The Pearson product-moment correlation coefficient was similar to that for TOTACC. However, the MPB, MAD, and MSPE per year squared were smaller than those for TOTACC. Figure 8 depicts the prediction performance of the original model for individual sites in the Georgia 0.05-mile data. The original model clearly performs poorly when applied to the Georgia data, a finding that would have been expected on the basis of the low Pearson product-moment correlation coefficients for the Georgia data.

Table 69. Parameter Estimates for INJACC Type III Model Using Georgia Data: Variant 2
^{1} Vogt, 1999, (p. 113)
^{2} K: Overdispersion value

Table 70. Validation Statistics for INJACC Type III Model Using Georgia Data: Variant 2
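To give a sense of what a change in the overdispersion parameter K means, the sketch below assumes the common negative binomial parameterization in which the variance of the accident count is the mean plus K times the mean squared. This section does not state the parameterization, so that form is an assumption, and the function name, mean, and K values are hypothetical illustrations rather than estimates from the tables.

```python
def nb_variance(mean, k):
    """Variance of a negative binomial count, assuming Var = mean + k * mean**2.

    With k = 0 this reduces to the Poisson case (variance equals the mean).
    """
    if mean < 0 or k < 0:
        raise ValueError("mean and k must be non-negative")
    return mean + k * mean ** 2


# Hypothetical values: the same mean with K doubled from 0.3 to 0.6.
base = nb_variance(4.0, 0.3)
doubled = nb_variance(4.0, 0.6)
print(base, doubled)
```

Under this assumed form, doubling K leaves the Poisson part of the variance unchanged and doubles only the extra-Poisson part, so the total variance grows by less than a factor of two.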
Figure 8. Observed versus Predicted Accident Frequency: Injury Variant 2

2.5.4 Model IV

The summary statistics are provided in table 71. Peak left-turn percentage on the major road (PKLEFT1) was not available in the Georgia data because this variable would be too costly to collect in the field. Because the variable was missing, the validation procedure had to be modified. The variable was removed from the original model by dividing both sides of the model equation by the exponential of the product of the variable's coefficient and its average effect (the average effect of PKLEFT1 is the average value of PKLEFT1 in the calibration data). The summary statistics showed that about 31 percent of the sites in the original data had no left-turn lane, while 17 percent of the sites in the Georgia data were without a left-turn lane.

The summary statistics for all three States (California, Michigan, and Georgia) were also compared (refer to table 72). All of the sites in Michigan had no LTLN1S, while the frequencies of TOTACC and TOTACCI for Georgia were higher than for the California data.

Pearson correlations for the original data, Georgia, and California are given in table 73. The coefficients for AADT2 and LTLN1S estimated using the Georgia data had signs opposite to those in the original model, which required further investigation. Pearson correlations of these variables with the response (accident frequency) were computed for all three States: California, Michigan, and Georgia. Recall that the Pearson correlation reflects the degree to which two variables are linearly related. Unlike in the original data, AADT2 in Georgia is negatively linearly related to TOTACC and TOTACCI, but these correlations are marginal and statistically insignificant. The variable LTLN1S is positively related to TOTACC and TOTACCI in Georgia and California (not significant), but the relationship is negative and significant for the Michigan data.

Table 71. Summary Statistics of Georgia Data: Type IV
^{1} N/A: not available

Table 72. Summary Statistics of California, Michigan, and Georgia
^{1} Summary statistics for California and Michigan were produced using the obtained original data
^{2} Used TOTACC and TOTACCI for 0.05 mile

Table 73. Pearson Correlations: Original, Georgia, and California

The original data (N=72)
Georgia (N=52): 0.05 mile
^{1} N/A: not available

Table 73. Pearson Correlations: Original, Georgia, and California (Continued)

California (N=54)
^{1} N/A: not available

Total Accident Models (TOTACC)

The parameter estimates, their standard errors, and p-values are given in table 74. Because the variable PKLEFT1 (peak left-turn percentage on the major road) is not present in the Georgia data, the validation procedure had to be modified as described earlier. The validation used the parameter estimates from the originally published report, and the parameter estimates were also reproduced without PKLEFT1 for the revised original model ("Revised Estimates" in table 74). In the revised original model, all of the variables were estimated with the same sign but with large differences in magnitude. The effect of AADT1 became smaller, while that of AADT2 became larger. The overdispersion values with the Georgia data were higher than for the original models, but the difference was not great.

For the Georgia data, the constant term and AADT1 were estimated with the same sign as in the original models. However, AADT2 and LTLN1S were estimated with the opposite sign to the original model, although AADT2 was insignificant. The values of the overdispersion parameter K for the Georgia data were lower than those for the original data.

Table 74. Parameter Estimates for TOTACC Type IV Model Using Georgia Data
^{1} Vogt, 1999, (p. 116)
^{2} Coefficient estimates of the variables were reproduced without PKLEFT1 using the original data
^{3} K: Overdispersion value
^{4} N/A: not available

Because PKLEFT1 was not available in the Georgia data, two models (the original model and the revised original model) were used in the validation to determine GOF measures. For the original model, the parameter estimates in the report were used. For the revised original model, PKLEFT1 was removed from the original model by dividing by the exponential of the product of this variable's coefficient and its average effect, i.e., the average value of PKLEFT1.

GOF measures of the revised original model, shown in table 75, indicate that it could be a good alternative to the original model. Pearson product-moment correlation coefficients, MAD per year, and MSE per year squared were similar to those for the original model. The MPB per year was higher than that for the original model, but the difference was not great.

Pearson product-moment correlation coefficients of 0.05 and 0.08 indicate that the accidents in the Georgia data are not linearly related to the model-predicted values. This could be the result of a significant nonlinearity in the data and the original model. The MPB and MAD per year for the Georgia data were larger than those for the original data. The MSPEs per year squared were also higher than the MSEs per year squared. Figure 9 depicts the prediction performance of the original model for individual sites in the Georgia 0.05-mile data. The original model clearly does not fit the Georgia data well, a finding that would have been expected on the basis of the low Pearson product-moment correlation coefficients for the Georgia data.

Table 75. Validation Statistics for TOTACC Type IV Model Using Georgia Data
^{1} Used the original main model in the report. This model includes PKLEFT1
^{2} Used the same coefficients as in the original model, but PKLEFT1 was removed from the model by dividing by the exponential of the product of this variable's coefficient and its average effect
^{3} Used the revised original model
^{4} N/A: not available

Figure 9. Observed versus Predicted Accident Frequency: TOTACC

Intersection Related Total Accident Model (TOTACCI)

The parameter estimates, their standard errors, and p-values are given in table 76. As before, two models (the original model and the revised original model) were used in the validation. For the original model, the parameter estimates in the report were used. Because the report also developed a model with AADT1 and AADT2 only, that model ("Revised Estimates" in table 76) was included in the validation. In this alternative original model, the constant term and the parameter estimates for AADT1 and AADT2 had the same sign but differed somewhat in magnitude. The effect of AADT1 became smaller, while that of AADT2 became larger. The overdispersion value was slightly higher than for the original model.

For the Georgia data, AADT2 was estimated with the opposite sign to that of the original models. However, it was statistically insignificant, and the impact of the variable on the accident prediction was marginal. The constant term and AADT1 were also insignificant for the Georgia data. The overdispersion values for the Georgia data were similar to that for the revised original model. The prediction performance measures are shown in table 77. As was the case for the TOTACC models, the revised model showed prediction performance measures similar to those of the original model.

Table 76. Parameter Estimates for TOTACCI Type IV Model Using Georgia Data
^{1} Vogt, 1999, (p. 117)
^{2} The report presents this model developed with AADT1 and AADT2 only
^{3} K: Overdispersion value
^{4} N/A: not available

Table 77. Validation Statistics for TOTACCI Type IV Model Using Georgia Data
^{1} Used the original main model in the report. This model includes PKLEFT1
^{2} Used the same coefficients as in the original model, but PKLEFT1 was removed from the model by dividing by the exponential of the product of this variable's coefficient and its average effect
^{3} Used the revised original model
^{4} N/A: not available

Pearson product-moment correlation coefficients of 0.16 and 0.17 indicate that the accident predictions of the original models are not strongly linearly correlated with the observed number of accidents in the Georgia data. Again, there are several possible explanations for this. The MPBs and MADs per year were larger than those for the original models. The MSPEs per year squared were also slightly higher than the MSEs per year squared. Figure 10 depicts the prediction performance of the original model for individual sites in the Georgia 0.05-mile data. The original model clearly does not fit the Georgia data well, a finding that would have been expected on the basis of the low Pearson product-moment correlation coefficients.

Figure 10. Observed versus Predicted Accident Frequency: TOTACCI

Injury Accident Model (INJACC)

The parameter estimates, their standard errors, and p-values are given in table 78. Again, all of the variables, including the constant term, were insignificant for the Georgia data, and AADT2 was estimated with the opposite sign to that of the original model. The overdispersion values for the Georgia data were higher than that for the original model.

Table 78. Parameter Estimates for INJACC Type IV Model Using Georgia Data
^{1} Vogt, 1999, (p. 118)
^{2} PKLEFT1 was not included in the model
^{3} K: Overdispersion value
^{4} N/A: not available
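The PKLEFT1 adjustment described for the Type IV models can be written out as a short derivation. This is one reading of the procedure, not a statement of the report's exact computation. It assumes a log-linear model form; the symbols are introduced here for illustration only: x_j stands for the remaining covariates (for example, log traffic volumes), beta_P for the PKLEFT1 coefficient, and P-bar for the mean of PKLEFT1 in the calibration data.

```latex
% Assumed log-linear form; \bar{P} is the calibration-data mean of PKLEFT1.
\begin{align*}
\hat{Y} &= \exp\Big(\beta_0 + \sum_j \beta_j x_j + \beta_P\,\mathrm{PKLEFT1}\Big) \\
\frac{\hat{Y}}{\exp(\beta_P \bar{P})}
  &= \exp\Big(\beta_0 + \sum_j \beta_j x_j + \beta_P\,(\mathrm{PKLEFT1} - \bar{P})\Big) \\
  &= \exp\Big(\beta_0 + \sum_j \beta_j x_j\Big)
  \quad \text{when } \mathrm{PKLEFT1} = \bar{P}
\end{align*}
```

Under this reading, dividing both sides by exp(beta_P times P-bar) leaves a right-hand side that no longer depends on PKLEFT1 for a site at the calibration mean, which is why the adjusted model can be applied to data that lack the variable.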