U.S. Department of Transportation
Federal Highway Administration
This report is an archived publication and may contain dated technical, contact, and link information.
Publication Number: FHWA-HRT-04-100
Date: September 2005
Safety Effects of Marked Versus Unmarked Crosswalks at Uncontrolled Locations Final Report and Recommended Guidelines
For the purpose of assessing pedestrian safety, an ideal study design would involve removing all crosswalks in several test cities, then randomly assigning sites for crosswalk markings and to serve as unmarked control sites. However, due to liability considerations, it would be impossible to get the level of cooperation needed from the cities to conduct such a study. Also, such random assignment of crosswalk marking locations would result in many crosswalks not being marked at the most appropriate locations.
Given such real-world constraints, a treatment and matched comparison site methodology was used to quantify the pedestrian crash risk in marked and unmarked crosswalks. This study design allowed for selection of a large sample of sites in cities throughout the United States where marked crosswalks and similar unmarked comparison sites were available. At intersections, the unmarked crosswalk comparison site was typically the opposite leg of the same intersection as the selected marked crosswalk site. For each marked midblock crosswalk, a nearby midblock location on the same street (usually a block or two away) where pedestrians were observed to cross was chosen as the comparison site. (Even though an unmarked midblock crossing is not technically or legally a crosswalk, it was a suitable comparison site for a midblock crosswalk.) The selection of a matched comparison site for each crosswalk site (typically on the same route and very near the crosswalk site) helped to control for the effects of vehicle speeds, traffic mix, and a variety of other traffic and roadway features.
A before-after study design was considered impractical because of regression-to-the-mean problems, limited sample sizes of new crosswalk installations, and other factors. A total of 1,000 marked crosswalk sites and 1,000 matched unmarked (comparison) crossing sites in 30 cities across the United States (see figure 11) were selected for analysis. In this study, no attempt was made to actually paint any of the 1,000 unmarked crosswalks to determine any crash effects in a before and after study. Instead, a separate (companion) study was conducted to monitor the effects of marking crosswalks on pedestrian and motorist behaviors. These study results are discussed in chapter 3 of this report.
Test sites were chosen without any prior knowledge of their crash history. School crossings were not included in this study because the presence of crossing guards and/or special school signs and markings could increase the difficulty of quantifying the safety effects of crosswalk markings.
Test sites were selected from the following cities:
Detailed information was collected at each of the 2,000 sites, including pedestrian crash history (average of 5 years per site), daily pedestrian volume estimates, ADT volume, number of lanes, speed limit, area type, type of median, type and condition of crosswalk marking patterns, location type (midblock or intersection), and other site characteristics. It was recognized that pedestrian crossing volumes would likely be different in marked and unmarked crosswalks. This study design involved collecting pedestrian volume counts at each of the 2,000 sites, and controlled for differences in pedestrian crossing exposure. The study computed pedestrian crashes per million crossings to normalize the crash data for pedestrian crossing volumes, as described below in more detail.
All of the 1,000 marked crosswalks had one of the marking patterns shown in figure 12 (i.e., none had a brick pattern for the crosswalk). Of the 2,000 crosswalks, 1,622 (81.2 percent) were at intersections; the others were at midblock. Very few of the marked crosswalks had any type of supplemental pedestrian warning signs. While not much information currently exists on the safety effects of various types of warning signs (under various conditions), a behavioral evaluation of several innovative signs performed in 2000 by Huang et al. may be found at www.walkinginfo.org/rd. (25) Furthermore, none of the test sites had traffic-calming measures or special pedestrian devices (e.g., in-pavement flashing lights). Estimates of daily pedestrian volumes at each crosswalk site and unmarked comparison site were determined based on pedestrian volume counts at each site, which were expanded to estimated daily pedestrian volume counts based on hourly adjustment factors. Specifically, at each of the 2,000 crossing locations, trained data collectors conducted onsite counts of pedestrian crossings and classified pedestrians by age group based on observations.
Pedestrian counts were collected simultaneously for 1 hour at each of the crosswalk and comparison sites. Full-day (8- to 12-hour) counts were conducted at a sample of the sites and were used to develop adjustment factors by area type (urban, suburban, fringe) and by time of day. The adjustment factors were then used to determine estimated daily pedestrian volumes in a manner similar to that used by many cities and States to expand short-term traffic counts to average annual daily traffic (AADT). Performing the volume counts simultaneously at each crosswalk site and its matched comparison site helped to control for time-related influences on pedestrian exposure. Further details of the data collection methodology are given in appendix A.
This study was structured to address a variety of questions related to crosswalks and pedestrian crashes. The primary analysis question was, "What are the safety effects of marked versus unmarked crosswalks?"
Several other analysis questions needed to be answered as well, including:
The amount of pedestrian crash data varied somewhat from city to city and averaged approximately 5 years per site (typically from about January 1, 1994 to December 31, 1998). Police crash reports were obtained from each of the cities except Seattle, WA, where detailed computerized printouts were obtained for each crash. Crashes were carefully reviewed to assign crash types, to ensure accurate matching to the correct location, and to determine whether the crash occurred at the crossing location (i.e., at or within 6.1 m (20 ft) of the marked or unmarked crossing of interest).
Standard pedestrian crash typology was used to review police crash reports and determine the appropriate pedestrian crash types (e.g., multiple threat, midblock dartout, intersection dash), as discussed later in this report. All treatment (crosswalk) and comparison sites were chosen without prior knowledge of crash history. All sites used in this study were intersection or midblock locations with no traffic signals or stop signs on the main road approach (i.e., uncontrolled approaches). This study focused on pedestrian safety and, therefore, data were not collected for vehicle-vehicle or single-vehicle collisions, even though it is recognized that marking crosswalks may increase vehicle stopping, which may also affect other collision types.
The selected analysis techniques were deemed to be appropriate for the type of data in the sample. Due to relatively low numbers of pedestrian crashes at a given site (many sites had zero pedestrian crashes in a 5-year period), Poisson modeling and negative binomial regression were used to analyze the data. Using these analysis techniques allowed determination of statistically valid safety relationships. In fact, there were a total of 229 pedestrian crashes at the 2,000 crossing sites over an average of 5 years per site. This translates to an overall average of one pedestrian crash per crosswalk site every 43.7 years.
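The per-site average quoted above follows from simple arithmetic on the totals given in the text. The short sketch below reproduces it; the function name is illustrative and not part of the study.

```python
def years_per_crash(total_crashes: int, sites: int, avg_years_per_site: float) -> float:
    """Average number of site-years that elapse per recorded pedestrian crash."""
    site_years = sites * avg_years_per_site  # total observation time across all sites
    return site_years / total_crashes


# Totals taken from the text: 229 crashes, 2,000 sites, about 5 years per site.
print(round(years_per_crash(229, 2000, 5.0), 1))  # 43.7
```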
While this rate of pedestrian crashes seems small on a per-site basis, it must be understood that many cities have hundreds or thousands of intersections and midblock locations where pedestrians regularly cross the street. Considering that pedestrian collisions with motor vehicles often result in serious injury or death to pedestrians, it is important to better understand what measures can be taken by engineers to improve pedestrian safety under various traffic and roadway conditions.
All analyses of crash rates at marked and unmarked crosswalks took into account traffic volume, pedestrian exposure, and other roadway features (e.g., number of lanes). To supplement the pedestrian crash analysis, a corresponding study was conducted on pedestrian and driver behavior before and after marked crosswalks were installed at selected sites in California, Minnesota, New York, and Virginia, as discussed earlier.(13,14)
The Poisson and negative binomial regression modeling were conducted in two ways in terms of how the comparison sites were handled. These were:
Analyses were conducted using both assumptions to ensure that the results were not influenced merely by the manner in which the matching was conducted.
The analyses revealed very similar results using either of the assumptions listed above in terms of:
In short, either analysis approach (grouping comparison sites, or using an analysis that matches marked and unmarked sites) produced nearly identical results. The discussion below includes results of both analysis approaches.
At each of the 2,000 crossing sites, at least one hour-long count of pedestrian street crossings was conducted. Based on the time of day of the count, an expansion factor was used to compute an approximate pedestrian ADT. At a given observation site $i$, a count $n_i$ is made of pedestrians crossing the street during some interval of time $T_i$. From a standard distribution of pedestrian volume by time of day, the proportion $p_i$ of daily pedestrian traffic expected during $T_i$ can be determined. If $n_i \neq 0$, the daily total pedestrian volume is estimated as $N_i = n_i / p_i$.
This estimate has the property that if $N_i$ were known, then the estimated pedestrian volume during the interval $T_i$ would be $N_i p_i = n_i$, the observed number.
A detailed discussion of how pedestrian ADTs were determined based on short-term pedestrian crossing counts is given in appendix A.
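The expansion from a short count to a daily volume can be sketched as follows. This is a minimal illustration of dividing the count by the proportion of daily traffic expected in the counting interval; the function name and the sample values (12 pedestrians, an 8 percent hourly share) are hypothetical and do not come from the study's adjustment factors.

```python
def estimate_daily_ped_volume(count: int, share_of_daily_traffic: float) -> float:
    """Expand a short-term pedestrian count to an estimated daily volume.

    count                  -- pedestrians observed during the counting interval
    share_of_daily_traffic -- proportion of daily pedestrian traffic expected
                              during that interval (between 0 and 1)
    """
    if count <= 0:
        # The text defines the estimate only for nonzero counts.
        raise ValueError("estimate is defined only for a nonzero count")
    if not 0.0 < share_of_daily_traffic <= 1.0:
        raise ValueError("share_of_daily_traffic must be in (0, 1]")
    return count / share_of_daily_traffic


# Hypothetical example: 12 crossings counted in an hour expected to carry
# 8 percent of the day's pedestrian traffic.
daily = estimate_daily_ped_volume(12, 0.08)
print(round(daily))  # 150
```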
Assuming that motor vehicle volumes, speeds, and other site features remain constant, it is reasonable to expect that the number of pedestrian crashes will increase as the number of pedestrians crossing the street (pedestrian exposure) increases. When comparing sites to see which has the greatest risk of a pedestrian crash, it is necessary to control for the number of pedestrians. The pedestrian crash rate is a more appropriate measure of safety than the total number of pedestrian crashes for comparing the relative safety of marked and unmarked crosswalks, particularly since pedestrian crossing volumes differ at marked and unmarked crosswalks. In this study, crash rates were calculated in terms of crashes per million pedestrian crossings. For example, if an average of 1,000 pedestrians cross an intersection every day, then there will be 365,000 (or 0.365 million) pedestrian crossings in a year. The number of pedestrian crashes in a year is then divided by 0.365 million times the number of years to get the pedestrian crash rate.
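The rate calculation described above can be written out as a short sketch. The 1,000 pedestrians per day figure is the example used in the text; the single crash and the 5-year period are hypothetical inputs chosen only to show the arithmetic.

```python
def crash_rate_per_million_crossings(crashes: int,
                                     daily_ped_volume: float,
                                     years: float) -> float:
    """Pedestrian crashes per million pedestrian crossings."""
    # 365 crossings-days per year; divide by 1e6 to express exposure in millions.
    million_crossings = daily_ped_volume * 365 * years / 1_000_000
    return crashes / million_crossings


# 1,000 pedestrians per day is 0.365 million crossings per year.
# Hypothetical: 1 crash over 5 years of data.
rate = crash_rate_per_million_crossings(crashes=1, daily_ped_volume=1000, years=5)
print(round(rate, 3))  # 0.548
```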
The following analysis was conducted to determine which traffic and roadway variables have a significant effect on pedestrian crashes. Table 1 shows some summary values of pedestrian volumes and crashes for marked and unmarked crosswalks categorized by number of lanes.
For each marked crosswalk, a closely matched unmarked comparison site was chosen, usually a nearby site on the same street. Quite often, the comparison site was the opposite approach to the same intersection (on the same road). As a result of this matching, the distributions of site characteristics, including traffic volumes, should be essentially the same for marked and unmarked sites. Pedestrian volumes were recorded at a marked crosswalk and its matched unmarked location at essentially the same time of day and for an equal period of time. Thus, pedestrian volumes were free to vary between marked and unmarked sites but were collected in such a way as to represent equal proportions of expected daily pedestrian traffic at the respective locations.
*Ped. Vol. = Sum of the pedestrian ADT at sites within a given grouping (by number of lanes).
**Avg. Yrs. = Average number of years of crash data per site.
The pedestrian ADT per site was 312 at marked crosswalks and 155 at unmarked crosswalks, as shown in table 1. Thus, 66.8 percent of this pedestrian volume occurred at marked crosswalk sites. A total of 229 pedestrian crashes were recorded at these 2,000 sites over a period of roughly 5 years. If marked and unmarked crosswalks were equally safe (or unsafe), then given that 229 crashes occurred, it would be expected that 66.8 percent of them (153 crashes) would have occurred at marked crosswalk sites. This expected number is considerably smaller than the actual number of 188 observed at marked crosswalks. Under the hypothesis of equal safety, and conditional on 229 total crashes, the probability of observing 188 or more crashes at the marked sites can be obtained from the binomial distribution with parameters $p = .668$ and $n = 229$, as $P(X \ge 188) = \sum_{x=188}^{229} \binom{229}{x} (.668)^{x} (.332)^{229-x}$, a probability far smaller than .001.
Thus, the hypothesis of equal safety across the entire set of sites would be rejected.
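The binomial comparison described above can be reproduced with a short sketch using only the Python standard library. The inputs (66.8 percent exposure share, 229 total crashes, 188 at marked sites) are the figures quoted in the text; the function name is illustrative, and the exact software the authors used for this step is not stated.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(k, n + 1))


exposure_share_marked = 312 / (312 + 155)      # about 0.668
expected_marked = 229 * exposure_share_marked  # about 153 crashes
tail = binomial_upper_tail(188, 229, 0.668)    # chance of 188 or more under equal safety

print(round(exposure_share_marked, 3), round(expected_marked))
print(tail < 0.001)
```

The same function applies to any subset of sites by substituting that subset's exposure share and crash counts.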
On the other hand, there may be subsets defined by various site characteristics where such a hypothesis would not be rejected. For example, consider the first two rows of table 1, which refer to sites on streets having two lanes. At these sites, 62.7 percent of the pedestrian volume occurred on marked crosswalks. Of the 60 crashes that occurred at these sites, 37.6 crashes would be expected at the marked crosswalk sites compared with the observed count of 37. Clearly, the hypothesis of equal safety could not be rejected for this subset of sites. In other words, for the two-lane road sites in the database, there was no significant difference in pedestrian crashes between marked and unmarked crosswalks.
From the rows of table 1 corresponding to three- or four-lane roads and roads with five or more lanes, the observed crash frequencies for the marked crosswalk sites are 94 and 57, respectively. Both totals considerably exceed the expected values of 77.6 and 45.7 based on proportions of pedestrian exposure at these sites. The probabilities of observing values this extreme by chance are:
In the expressions given above, the parameters p1 and p2 represent proportions of pedestrian volumes at marked sites adjusted for slight differences in exposure times over which crash data were obtained. These results suggest that, in general, marked crosswalks are less safe than unmarked crosswalks on streets having more than two lanes, but that the two types do not differ significantly on streets with two lanes. Note that the analysis described above did not require adjustment for motor vehicle volume, since matched pairs of marked and unmarked sites typically were selected at or near the same intersection where vehicle volumes were similar.
To investigate the relationship between other factors and combinations of factors on crosswalk pedestrian crashes, generalized linear regression models were fit to the data to predict crashes as functions of these variables. Consider a model based on pedestrian volumes (ADP); traffic volumes (ADT); and two indicator variables, one which indicates one or two travel lanes (L2), and the other which indicates three or four travel lanes (L4). The resulting model has the form
where $E(\mathrm{Accs}_i)$ is the expected number of pedestrian crashes at site $i$, $\mathrm{yrs}_i$ is the number of years over which crash data were available for site $i$, and $b_0, b_1, \ldots, b_4$ are parameters to be estimated. Models of this form were fit to data from marked and unmarked crosswalks separately. The models were fit by maximum likelihood methods using the generalized linear models procedure (PROC GENMOD) software, as developed by the SAS Institute. Crashes were assumed to follow a negative binomial distribution.
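A form consistent with the definitions above is sketched below. It assumes a log link (the usual choice for negative binomial models), exposure years entering multiplicatively, and untransformed covariates; none of these details is confirmed by the surrounding text, so the expression should be read as an illustration rather than the report's exact equation.

```latex
E(\mathrm{Accs}_i) = \mathrm{yrs}_i \cdot
  \exp\left( b_0 + b_1\,\mathrm{ADP}_i + b_2\,\mathrm{ADT}_i
           + b_3\,L2_i + b_4\,L4_i \right)
```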
Parameter estimates for these basic models are shown in table 2.
*S.E. = Standard Error
For marked crosswalks, the results in table 2 show that expected crashes increased to a significant degree with both increasing pedestrian volume and increasing traffic volumes, with a much steeper increase for traffic volume. The lane variables compare two-lane roads with roads having five or more lanes, and three- or four-lane roads with roads having five or more lanes. The two-lane variable is marginally significant, while the three- or four-lane variable is not. The overall lanes effect (not shown) is significant (p-value of .0262). In subsequent models, a two-level lanes effect comparing two lanes with three or more is used. This variable is usually significant at a level of about .02.
The results for unmarked crosswalks show the only statistically significant effect to be for pedestrian volume. Thus, expected crashes on unmarked crosswalks increased consistently with increasing pedestrian volumes (at a somewhat higher rate than that at marked crosswalks), but did not change consistently with increasing traffic volumes or with number of lanes. These results suggest that multilane streets with low traffic volumes might represent another subset of the data where marked and unmarked crosswalks might not differ significantly with respect to safety. This issue is addressed in more detail later in the report.
In addition to the variables included in the models presented above, data were available for several other factors potentially associated with crosswalk safety. These included:
Neither speed limit nor crosswalk location (intersection or midblock) had a significant effect in the models for marked or unmarked crosswalk crashes. Initially, three types of medians were compared with no median. These were:
Several specific types of crosswalks were represented in the data, but the primary comparison came down to a comparison between the standard markings (two parallel lines) versus designs with more markings (e.g., continental or ladder patterns shown in figure 12).
In attempting to estimate these more detailed models, it was also a concern to consider effects due to specific locations (i.e., cities, States, regions) from which the data were obtained since crashes, types of medians and crosswalks, and other variables were not uniformly distributed across these locations. To this end, two sets of regions were identified (North-South and East-Midwest-West), and class variables indicating these regions were included in the models. A second approach was to estimate a model using data from all locations, then to re-estimate the model while omitting the data from each of the eight cities where the most data had been obtained, one step at a time, to see how the estimates changed. These eight cities and the total number of observation sites at each are listed below.
A few iterations of this process resulted in a model for marked crosswalk crashes summarized in table 3. The model for table 3 contains no variable pertaining to crosswalk type, a single variable indicating a raised median as opposed to no median or another median type, and another variable indicating the western region of the country as opposed to the East or Midwest.
In some preliminary models, there was an indication that the crosswalk types with more markings were associated with slightly lower crash rates than the standard type. These results were not consistent across models and became quite nonsignificant when regional variables were included. Similarly, preliminary models indicated that raised medians were marginally better (associated with lower crash rates) than crosswalks having no median or painted medians, while two-way left turn lanes were significantly worse than the other types. With the addition of the East-Midwest-West regional variables, the two-way left turn lane effect became nonsignificant, and the raised median effect became more significant. All of the two-way left turn lanes in the study sample were in the western region. The two-way left turn lanes did not account for the estimated West effect, however, since this estimate remained virtually unchanged when the data from the two-way left turn lane sites were deleted from the model.
*S.E. = Standard Error
The North-South regional variable was not statistically significant. East-to-West effects were modeled as two variables, one comparing West to East, and the other comparing Midwest to East. The West-to-East comparison was significant, while the Midwest-to-East comparison was not. These variables were then collapsed to a single variable contrasting West with Midwest and East combined, which is the form used in the model of table 3. The apparent effect due to the western region was investigated further to see if this effect could be attributed to differing distributions of speed limits and/or numbers of lanes. This did not prove to be the case.
Table 4 shows estimates of the same model parameters on the data subsets obtained by leaving out the data from each of the major cities. In general, the estimates are quite consistent across the subsets. All estimates listed were statistically significant at a .05 level with the exception of the two marked with an asterisk. These were the raised median effects on the datasets that omitted data from New Orleans, LA, and from Milwaukee, WI. The p-values for these estimates were .10 and .08, respectively.
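The leave-one-city-out check described above can be sketched generically. In the sketch, the record layout, the toy data, and the stand-in `fit` function (a simple mean) are all hypothetical; the study refit a negative binomial regression on each subset, which is not reproduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Site = Tuple[str, int]  # (city label, crash count): a toy record, not the study's schema


def leave_one_city_out(sites: List[Site],
                       cities: Iterable[str],
                       fit: Callable[[List[Site]], float]) -> Dict[str, float]:
    """Refit on the full data, then again with each listed city's sites removed."""
    results = {"all cities": fit(sites)}
    for city in cities:
        subset = [s for s in sites if s[0] != city]  # drop one city's sites
        results[f"without {city}"] = fit(subset)
    return results


def mean_crashes(sites: List[Site]) -> float:
    """Stand-in for a model fit: average crashes per site."""
    return sum(crashes for _, crashes in sites) / len(sites)


toy_sites = [("A", 0), ("A", 2), ("B", 1), ("C", 0), ("C", 1)]
for label, estimate in leave_one_city_out(toy_sites, ["A", "B", "C"], mean_crashes).items():
    print(label, round(estimate, 3))
```

Estimates that stay close to the full-data value across all subsets indicate that no single city drives the result.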
Results from the more detailed crash modeling on unmarked crosswalks are presented in tables 5 and 6. In contrast to the results of table 2, table 5 shows that when a variable indicating the presence of a median was included in the model, the effect of traffic volume (ADT) became statistically significant. As with marked crosswalks, various median types were also considered; in this case, a variable indicating a median of any type versus no median was the most relevant characterization. For unmarked crosswalks, the East, Midwest, and West comparisons showed the eastern region to have significantly lower crash rates than either the West or Midwest. Thus, a two-level variable contrasting east with the other two regions was used. The North-South comparison was again not significant.
* Not statistically significant at .05 level.
*S.E. = Standard Error
Table 6 shows the estimates of these model parameters were again consistent across the eight data subsets. The estimates marked with an asterisk (which were not significant at a .05 level) were the ADT effect on the subset with Seattle, WA, data omitted, and the ADT effect and eastern region effects on the subset with New Orleans, LA, data omitted. The p-values for these estimates were .06 in each case.
* Not statistically significant at .05 level.
While the models presented above examine the effects of medians, crosswalk designs, and other factors on pedestrian crashes, the primary factors associated with these crashes were shown to be pedestrian volumes and traffic volumes. Analyses based on the data shown in table 1 indicated no significant difference in the safety of marked and unmarked crosswalks on streets having two or fewer lanes, while marked crosswalks were less safe overall on multilane roads. The models suggest a further examination of multilane roads as a function of varying traffic volumes and the presence of raised medians.
Table 7 shows pedestrian volumes, crashes, and average exposure years for a number of categories defined by number of lanes, traffic volumes, and median type. Using the same approach as for table 1, a marked crosswalk exposure proportion, $p_{mi}$, was computed for category $i$, as
where the sum extends over all sites ($S$) in category $i$, $X_{mi}$ is the total exposure for marked crosswalks in category $i$, and $X_{umi}$ is similarly defined as the total exposure for unmarked crosswalks in category $i$.
*Avg. Yrs. = Average number of years of crash data per site.
Then, conditional on the total crashes $N_i$ in category $i$, expected marked crosswalk crashes under the hypothesis of equal safety were estimated as $\tilde{A}_{mi} = N_i p_{mi}$. The probability under this hypothesis of observing as many or more crashes in marked crosswalks as actually occurred was obtained from the binomial distribution with parameters $p_{mi}$ and $N_i$. Table 8 lists these quantities for the various crosswalk categories.
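The three quantities in this procedure can be written together as follows. The exposure definition in the first two lines (pedestrian ADT multiplied by years of crash data, summed over sites) is an assumption based on the earlier remark that exposure proportions were adjusted for differences in exposure time; the site-level symbols $\mathrm{ADP}_s$ and $\mathrm{yrs}_s$ and the observed count $A_{mi}$ are introduced here for illustration.

```latex
\begin{align*}
X_{mi}  &= \sum_{s \in S_{\text{marked},\,i}}   \mathrm{ADP}_s \,\mathrm{yrs}_s, \qquad
X_{umi}  = \sum_{s \in S_{\text{unmarked},\,i}} \mathrm{ADP}_s \,\mathrm{yrs}_s \\
p_{mi}  &= \frac{X_{mi}}{X_{mi} + X_{umi}}, \qquad
\tilde{A}_{mi} = N_i \, p_{mi} \\
P(a \ge A_{mi}) &= \sum_{a = A_{mi}}^{N_i} \binom{N_i}{a} \, p_{mi}^{\,a} \, (1 - p_{mi})^{N_i - a}
\end{align*}
```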
The results in table 8 suggest that on two-lane roads, multilane roads without raised medians and traffic volumes below 12,000 ADT, and multilane roads having raised medians and traffic volumes below 15,000 ADT, the hypothesis of equal safety for marked and unmarked crosswalks cannot be rejected.
In other words, there was no significant effect of marked versus unmarked crosswalks on pedestrian crashes under the following conditions:
For multilane roads with ADTs above these values, there was a significant increase in pedestrian crashes on roads with marked crosswalks, compared to roads with unmarked crosswalks (after controlling for traffic ADT and pedestrian ADT).
$p_m$ = Proportion of pedestrian exposure at marked crosswalks.
$A_m$ = Actual number of pedestrian crashes at the marked crosswalks.
$E(A_m)$ = Estimated (predicted) number of pedestrian crashes at marked crosswalks.
$P(a \ge A_m)$ = Binomial probability of observing $A_m$ or more crashes at marked crosswalks.
Each pedestrian in both the crash and exposure samples was classified into one of seven age categories: 12 and under, 13-18, 19-25, 26-35, 36-50, 51-64, and 65 and over. Across the entire set of sites, the two age distributions differed substantially, with a considerably higher proportion of young adults (19-35) in the exposure sample (compared to other age groups), and a much higher proportion of the oldest age group in the crash sample. The difference was statistically significant ($\chi^2$ = 216.86, 6 df, p = .001).
The data were then partitioned into four subsets determined by marked or unmarked crosswalks on streets having two lanes or having three or more lanes. The same general pattern of the exposure and crash age distributions tended to hold on the subsets. In particular, the crash distribution tended to always be higher for the oldest pedestrian group. The relatively small sample sizes of crashes in some of the subsets necessitated combining some of the age categories to obtain a valid statistical comparison of the distributions.
Marked crosswalks on two-lane roads. There were 33 crashes in this subset. With seven age categories, several cells had expected counts of fewer than five, so the two youngest and the two oldest age groups were combined. It might be noted, however, that 7 of the 33 crashes (21.2 percent) involved pedestrians in the 65-and-over age group, compared to 3.4 percent in the exposure sample. The five-category collapsed distributions differed significantly ($\chi^2$ = 11.00, 4 df, p = .027). Of the crash-involved pedestrians, 30.3 percent were in the 51-and-over age category, compared to 13.2 percent in the exposure sample.
Unmarked crosswalks on two-lane roads. Only 21 pedestrian crashes occurred in this subset. Again, five-category age distributions were used for the statistical test. While the percentage of crash-involved pedestrians in the oldest age category (51 and older) was higher than that of the exposure sample (19.1 percent versus 10.8 percent), the distributions overall did not differ significantly ($\chi^2$ = 4.40, 4 df, p = .354).
Marked crosswalks on multilane roads. Nearly 70 percent of the pedestrian crosswalk crashes occurred in this subset. The comparison of the seven-category age distributions gave results quite similar to those for the overall samples, with the proportion of young adults being lower in the crash sample and the proportion in the 65+ age group being much higher in the crash sample (18.1 percent versus 2.2 percent). The distributions differed significantly ($\chi^2$ = 166.88, 6 df, p = .001).
Unmarked crosswalks on multilane roads. Only 16 pedestrian crashes occurred at unmarked crosswalks on multilane roads, 6 of which involved pedestrians 51 years old or older. A simple comparison of this age category versus younger pedestrians between the two samples yielded a significant result ($\chi^2$ = 18.48, 1 df, p = .001). Pedestrians 51 and older accounted for 37.5 percent of the crash sample, compared with 8.1 percent of the exposure sample.
The multilane marked crosswalk subset was further subdivided on the basis of traffic volume (ADT). In the subset with ADT ≤ 15,000, there were 39 pedestrian crashes; 10 (25.6 percent) of these involved pedestrians more than 50 years old. Only 13.9 percent of the exposure sample was over 50. A one-degree-of-freedom chi-square test indicated a significant difference ($\chi^2$ = 4.51, 1 df, p = .034).
Lowering the ADT cutoff to 12,000 reduced the size of the crash sample to 15. The percentages of pedestrians over 50 in the two samples were essentially unchanged (26.7 percent versus 13.9 percent), but with the smaller sample size the difference was no longer significant ($\chi^2$ = 2.04, 1 df, p = .1540).
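The one-degree-of-freedom comparisons above are close to what a goodness-of-fit chi-square test gives when the exposure-sample share is treated as a fixed expected proportion. The sketch below works under that assumption; whether the authors computed the statistic in exactly this way is not stated, and small differences from the reported values are expected because the percentages in the text are rounded.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_square_one_df(crashes_in_group: int,
                      crashes_total: int,
                      exposure_share: float) -> Tuple[float, float]:
    """Goodness-of-fit chi-square (1 df) for one age group versus all others.

    exposure_share is the group's proportion in the exposure sample, treated
    here as a fixed expected proportion (an assumption of this sketch).
    """
    expected_in = crashes_total * exposure_share
    expected_out = crashes_total - expected_in
    observed_out = crashes_total - crashes_in_group
    stat = ((crashes_in_group - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Figures from the text: 10 of 39 crashes, then 4 of 15 crashes (26.7 percent),
# each compared with a 13.9 percent exposure share.
for in_group, total in [(10, 39), (4, 15)]:
    stat, p = chi_square_one_df(in_group, total, 0.139)
    print(in_group, total, round(stat, 2), round(p, 3))
```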
In summary, older pedestrians were more at risk than younger pedestrians on virtually all types of crosswalks. This difference seemed most pronounced for marked crosswalks on multilane roads with high traffic volumes (ADT above 12,000), where crash occurrence was highest.
Data were collected on the condition of marked crosswalks. Conditions were coded as E (excellent), G (good), F (fair), and P (poor). This variable was entered as a class variable in the model for crashes on marked crosswalks to assess its effect on crashes. The estimated effect was not statistically significant (p = .1655).
Furthermore, there is no assurance that the condition of the crosswalk markings was consistent over the data collection period.
Overall, crashes tended to be more severe in marked crosswalks on multilane roads, but sample sizes were too small to draw any firm conclusions in that regard. In particular, there were six fatal crashes in marked crosswalks and none in unmarked crosswalks. The fatal crashes all occurred on multilane roads with traffic volumes greater than 12,000 ADT (5 with ADT > 15,000). Crash severity distributions did not differ significantly between marked and unmarked crosswalks on two-lane roads, based on a $\chi^2$ statistic comparing A or B level injury crashes with lesser or no injuries ($\chi^2$ = .268, 1 df, p = .604). Similarly, on multilane roads with ADT < 12,000, the $\chi^2$ statistic and p-value ($\chi^2$ = .210, 1 df, p = .647) showed no significant difference.
Previous models shown in this report used subgroups of the 2,000 crosswalks and modeled marked and unmarked separately. A final model (which incorporates the aforementioned results) also was fitted to all 2,000 crosswalks, and it includes direct correlation or matching of marked and unmarked crosswalks. To develop the final model form, generalized estimating equations (GEEs) were used, since they provide a practical method to analyze correlated data with reasonable statistical efficiency. PROC GENMOD uses GEE and permits the analysis of correlated data. Another feature of the final model is that the distribution of pedestrian crashes at a crosswalk is assumed to follow a negative binomial distribution. The negative binomial is a distribution with an additional parameter (k) in the variance function. PROC GENMOD estimates k by maximum likelihood. (Refer to McCullagh and Nelder (chapter 11), (26) Hilbe, (27) or Lawless (28) for discussions of the negative binomial distribution.)
The final model is a negative binomial regression model that was fitted with the observed number of pedestrian crashes as the dependent measure. A negative binomial model is an extension of traditional linear models that allows the mean of a population to depend on a linear predictor through a nonlinear link function and allows the response probability distribution to be a negative binomial distribution. PROC GENMOD is capable of performing negative binomial regression using GEE methodology. (29)
The final model uses the observed number of pedestrian crashes at a crosswalk as the dependent measure. The independent measures are estimated average daily pedestrian volume (pedestrian ADT, abbreviated ADP in the interaction terms); average daily traffic volume (traffic ADT); an indicator variable for marked crosswalks (CM); two indicator variables for number of lanes (one that indicates two travel lanes, L2; the other indicates three or four travel lanes, L4); and two indicators for median type (no raised median, Mnone, and raised median, Mraised).
There are two interactions in the model. The first is between pedestrian ADT and the indicator for marked crosswalk, ADP*CM. The second is between traffic ADT and the indicator for marked crosswalk, ADT*CM.
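The covariates and interactions listed above can be sketched as a small encoding function. This is only an illustration: the function name, the dictionary keys, and the string labels for median type are hypothetical. The choice of reference categories (five or more lanes, and a median type other than "none" or "raised") follows the description of the indicator comparisons elsewhere in this section.

```python
def encode_site(ped_adt, traffic_adt, marked, lanes, median):
    """Return the model covariates for one crossing as a dict.

    median must be "none", "raised", or "other". Five or more lanes and
    "other" medians are treated as reference categories (all indicators 0).
    """
    if lanes < 1:
        raise ValueError("lanes must be a positive integer")
    if median not in ("none", "raised", "other"):
        raise ValueError("median must be 'none', 'raised', or 'other'")

    cm = 1 if marked else 0
    return {
        "ADP": ped_adt,                       # pedestrian ADT
        "ADT": traffic_adt,                   # traffic ADT
        "CM": cm,                             # marked crosswalk indicator
        "L2": 1 if lanes == 2 else 0,         # two travel lanes
        "L4": 1 if lanes in (3, 4) else 0,    # three or four travel lanes
        "Mnone": 1 if median == "none" else 0,
        "Mraised": 1 if median == "raised" else 0,
        "ADP*CM": ped_adt * cm,               # interactions vanish when unmarked
        "ADT*CM": traffic_adt * cm,
    }


marked_site = encode_site(250, 20000, marked=True, lanes=5, median="none")
unmarked_site = encode_site(250, 20000, marked=False, lanes=5, median="none")
print(marked_site)
print(unmarked_site)
```

Because both interaction columns are products with CM, they are zero at every unmarked site, so the interaction coefficients only affect predictions for marked crosswalks.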
The linear predictor has the form:
where ηᵢ is the linear predictor for site i = 1, 2, ..., 2,000. The number of years of accident data available for a site is used as an offset. β₀, β₁, ..., β₉ are parameters to be estimated. The estimates of the parameters were obtained using PROC GENMOD. Parameter estimates for the final model are shown in table 9.
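Written out with the variables defined above, a linear predictor with an intercept and nine terms would look something like the following. The assignment of the first eight slope parameters to particular terms is an assumption made for this sketch; the text ties only the ninth parameter to the ADT*CM interaction. The log link and the log of years as the offset are the usual choices for a negative binomial count model rather than something this section states explicitly, and the symbols Y_i (crash count) and T_i (years of crash data) are introduced here for illustration.

```latex
\begin{align*}
\eta_i &= \beta_0
  + \beta_1\,\mathrm{ADP}_i
  + \beta_2\,\mathrm{ADT}_i
  + \beta_3\,\mathrm{CM}_i
  + \beta_4\,\mathrm{L2}_i
  + \beta_5\,\mathrm{L4}_i \\
 &\quad
  + \beta_6\,\mathrm{M}_{\mathrm{none},i}
  + \beta_7\,\mathrm{M}_{\mathrm{raised},i}
  + \beta_8\,(\mathrm{ADP}_i \times \mathrm{CM}_i)
  + \beta_9\,(\mathrm{ADT}_i \times \mathrm{CM}_i), \\[4pt]
\log \mathrm{E}[Y_i] &= \eta_i + \log T_i .
\end{align*}
```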
Table 9. Parameter estimates for the final model. (*S.E. = Standard Error)
The final model provides a framework to test the hypothesis that marked and unmarked crosswalks have the same expected number of pedestrian crashes in 5 years, controlling for the effects of pedestrian ADT, vehicle traffic ADT, number of lanes, and presence of a raised median. Because the interaction between traffic ADT and the indicator for marked crosswalk, ADT*CM (β₉), was statistically significant, it was concluded that the presence of a marked crosswalk increases the expected number of pedestrian crashes in 5 years; however, the effect size depends on the traffic ADT and number of lanes.
There is also a statistically significant interaction between pedestrian volume and the indicator for marked crosswalk, which was interpreted to mean that the effect size of a marked crosswalk depends on the pedestrian volume. The lane indicator variables compare two lanes with five or more lanes, and three or four lanes with five or more lanes. A two-degrees-of-freedom test for any lane effect has an associated p-value of 0.1071. The two median variables compare no median with other median, and raised median with other median. A two-degrees-of-freedom test for any median effect has an associated p-value of 0.0531. The number of lanes, type of median, pedestrian volume, and ADT are all intercorrelated. This correlation is evidenced by the fact that ADT increases as the number of lanes increases. Also, sites with two lanes do not have a median. The number of lanes was also included in the model, and its effect is probably expressed indirectly through ADT and median type. In the final model form, the regional effect was only marginally significant, and including the regional variables (i.e., western versus eastern region) in the model had virtually no influence on the crash effects of the other variables. Thus, the regional variable was not included in the final model.
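Under a log link (an assumption here, though it is the usual choice for negative binomial regression), the two marked-crosswalk interactions imply that the ratio of expected crashes at a marked site to those at an otherwise identical unmarked site varies with pedestrian ADT and traffic ADT. The coefficient subscripts below are descriptive labels introduced for this sketch, not the report's numbering.

```latex
\begin{align*}
\frac{\mathrm{E}[Y \mid \mathrm{CM}=1]}{\mathrm{E}[Y \mid \mathrm{CM}=0]}
  &= \exp\!\bigl(\beta_{\mathrm{CM}}
     + \beta_{\mathrm{ADP}\times\mathrm{CM}}\,\mathrm{ADP}
     + \beta_{\mathrm{ADT}\times\mathrm{CM}}\,\mathrm{ADT}\bigr), \\[4pt]
\mathrm{E}[Y \mid \mathrm{CM}=1] - \mathrm{E}[Y \mid \mathrm{CM}=0]
  &= \mathrm{E}[Y \mid \mathrm{CM}=0]
     \left(\frac{\mathrm{E}[Y \mid \mathrm{CM}=1]}{\mathrm{E}[Y \mid \mathrm{CM}=0]} - 1\right).
\end{align*}
```

In this form the lane and median terms cancel from the ratio but still scale the absolute difference through the unmarked baseline, which is one way to read the statement that the effect size depends on the number of lanes as well as on traffic ADT.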
Further discussion of the final model relative to the goodness-of-fit measures, residuals, and possible biases of multicollinearity is contained in appendix B. In short, the final model was found to be valid and appropriate for the available database. A considerable amount of data exploration was also conducted during the analysis phase of the study before developing the final model.
The final pedestrian crash prediction model can be illustrated by inputting various values of pedestrian ADT, traffic ADT, number of lanes (two lanes, four lanes, or more), and median type (raised median or no raised median). All values used in the following figures (and in appendix B) are well within the actual distributions of the data sample.
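The mechanics of turning input values into a predicted crash count can be sketched as follows, assuming a log link with the number of years entering as a log offset. The coefficient values are HYPOTHETICAL placeholders chosen only so the example runs; they are not the table 9 estimates, and the use of raw (unscaled) ADT as the covariate is likewise an assumption.

```python
import math


def expected_crashes(coefficients, covariates, years=5.0):
    """Expected crashes over `years` for a log-link model with offset log(years)."""
    eta = coefficients["intercept"]
    for name, value in covariates.items():
        eta += coefficients[name] * value
    # exp(eta + log(years)) is the same as years * exp(eta)
    return years * math.exp(eta)


# HYPOTHETICAL placeholder coefficients, NOT estimates from the report.
placeholder = {
    "intercept": math.log(0.01),
    "ADT": 0.0,
    "CM": 0.0,
    "ADT*CM": 0.00002,
}

unmarked = expected_crashes(placeholder, {"ADT": 20000, "CM": 0, "ADT*CM": 0})
marked = expected_crashes(placeholder, {"ADT": 20000, "CM": 1, "ADT*CM": 20000})
print(f"unmarked: {unmarked:.3f} crashes per 5 years")
print(f"marked:   {marked:.3f} crashes per 5 years")
```

With real coefficient estimates substituted for the placeholders, the same calculation would reproduce the solid response curves described in this section.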
Figures 13 through 17 and the figures in appendix C (figures 45 through 64) all contain plots of response curves based on the final negative binomial prediction model. Each of these graphs shows one solid line for marked locations and one for unmarked locations. Each solid line has a dashed line above and below it representing the upper and lower bounds of the 95 percent confidence interval.
Figure 13 shows predicted pedestrian crashes in a 5-year period over a range of pedestrian ADTs on two-lane roads with a traffic ADT of 5,000, based on the final crash prediction model. Notice that there is no difference in predicted pedestrian crashes in marked versus unmarked crosswalks for these conditions.
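This section does not say how the confidence bounds were computed. One common approach for log-link models, shown here purely as an assumption, is to form the interval on the linear-predictor (log) scale and exponentiate the endpoints. The standard error used below is an invented illustrative value, not a result from the study.

```python
import math


def log_scale_interval(expected, se_log, z=1.96):
    """Approximate 95 percent interval for an expected count.

    The interval is built on the log scale as log(expected) +/- z * se_log
    and then exponentiated, so it is never negative and is asymmetric
    around the point estimate.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if se_log < 0:
        raise ValueError("standard error must be non-negative")
    log_mean = math.log(expected)
    return math.exp(log_mean - z * se_log), math.exp(log_mean + z * se_log)


# Illustrative inputs only: 0.10 expected crashes, log-scale S.E. of 0.5.
lower, upper = log_scale_interval(0.10, 0.5)
print(f"lower = {lower:.4f}, upper = {upper:.4f}")
```

Bounds built this way sit farther above the curve than below it, which may help when reading dashed lines that are not equidistant from the solid line.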
Plots of pedestrian crashes in a 5-year period from the model are shown for two-lane roads as a function of traffic ADT in figure 14 (where pedestrian ADT = 300). Note that there is little if any difference in pedestrian crashes between marked and unmarked crosswalks, even for traffic ADTs as high as 15,000. In fact, for marked crosswalks with traffic ADT of 15,000 and 300 pedestrians per day, expected pedestrian crashes are 0.10 per 5 years, or 1 pedestrian crash per 50 years per site.
Figure 15 illustrates the predicted pedestrian crashes for a five-lane pedestrian crossing with no median and a pedestrian ADT of 250. As traffic ADT increases, pedestrian crashes stay relatively consistent on unmarked crosswalks (approximately 0.10 or less per 5 years). However, on marked crosswalks, pedestrian crashes increase as traffic ADT increases.
Plots of the final model are given for five-lane crosswalks with a raised median in figures 16 and 17. Pedestrian crashes are plotted against average pedestrian ADT in figure 16 for a traffic ADT of 10,000, and there is little difference in pedestrian crashes at marked versus unmarked crosswalks. Note in figure 17, however, that marked crosswalks have an increasingly greater number of pedestrian crashes than unmarked crosswalks as traffic ADT increases from 15,000 to 50,000.
Figure 13. Predicted pedestrian crashes versus pedestrian ADT for two-lane roads based on the final model.
Figure 14. Predicted pedestrian crashes versus traffic ADT for two-lane roads based on the final model (pedestrian ADT = 300).
Figure 15. Predicted pedestrian crashes versus traffic ADT for five-lane roads (no median) based on the final model.
Figure 16. Predicted pedestrian crashes versus pedestrian ADT for five-lane roads (with median) based on the final model.
Additional plots of pedestrian crashes using the final crash prediction model are given in appendix C for various combinations of the input variables. Tables of estimated pedestrian crashes per 5-year period are given in appendix D using the final model and inputting various combinations of traffic ADT, pedestrian ADT, number of lanes, and median type. Table 10 provides estimated pedestrian crashes for marked and unmarked five-lane crossings with a raised median. For example, from table 10, consider a marked crosswalk on a five-lane road (with a raised median) with 150 pedestrian crossings per day and a traffic ADT of 28,000. There would be 0.20 expected pedestrian crashes per 5-year period, or 1 pedestrian crash every 25 years, unless a pedestrian crossing improvement (e.g., traffic signals with pedestrian signals if warranted) is installed. In all cases, values of input variables are chosen well within actual ranges of the study database. A detailed discussion of potential pedestrian safety improvements at uncontrolled locations is in chapter 4 of this report.
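The conversions used in this section, from expected crashes per 5-year period to an average number of years per crash, are simple divisions. A minimal sketch is given below; the function name is illustrative.

```python
def years_per_crash(expected_crashes, period_years=5.0):
    """Average number of years per crash implied by an expected count per period."""
    if expected_crashes <= 0:
        raise ValueError("expected crashes must be positive")
    return period_years / expected_crashes


# The two cases worked in the text: 0.10 and 0.20 expected crashes per 5 years.
for expected in (0.10, 0.20):
    print(f"{expected:.2f} per 5 years -> 1 crash per {years_per_crash(expected):.0f} years")
```

These figures are long-run averages per site and are most useful for comparing sites or treatments, since a crash at any one site can occur at any time.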