U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-16-036    Date:  April 2016

 

Safety Evaluation of Continuous Green T Intersections

CHAPTER 6. MATCHING

Since the intersections in Florida and South Carolina likely differ with regard to unobservable variables (e.g., reporting thresholds and driver demographics), matching was done separately for each State. As described in the methodology, binary logistic regression was used to estimate the propensity scores for both States. Since the goal of the binary logit model was to yield matches with good covariate balance, the functional form of the variables in the propensity score models differed between Florida and South Carolina.

The binary logit models estimated the probability that the intersections were CGT intersections (i.e., the propensity score). The models were estimated at the intersection level (all years of data for each intersection). The propensity score model for Florida is shown in table 10, and the model for South Carolina is shown in table 11. For Florida, there were 68 observations, the pseudo R² was 0.1850, and the log-likelihood was -37.546. For South Carolina, there were 37 observations, the pseudo R² was 0.3318, and the log-likelihood was -16.911.

Table 10. Florida propensity score model.
Variable            Coefficient    Standard Error    p-Value
AADTThrough             0.00009           0.00003      0.010
AADTIntersecting        0.00001           0.00003      0.743
AADTmiss               -1.09534           0.99280      0.270
THRU_RLT                0.13667           0.67546      0.840
INT_RTL                -0.29956           0.83036      0.718
RAILCROSS               2.53527           1.60811      0.115
FRTH_LEG               -0.03758           0.95639      0.969
CURVE                   0.99022           0.70849      0.162
SKEW                    0.02244           0.02772      0.418
THRU_SW                 0.99344           0.79216      0.210
INT_SW                 -0.48141           0.66091      0.466
Intercept              -3.33964           1.56424      0.033
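
As an illustration of how a binary logit model such as the one in table 10 turns covariates into a propensity score, the following minimal sketch applies the logistic function to the linear predictor. The coefficients are copied from table 10; the example intersection, the function name `propensity_score`, and the choice to treat omitted covariates as zero are assumptions made only for illustration and are not taken from the report.

```python
import math

# Coefficients copied from table 10 (Florida propensity score model).
FLORIDA_COEFFICIENTS = {
    "AADTThrough": 0.00009,
    "AADTIntersecting": 0.00001,
    "AADTmiss": -1.09534,
    "THRU_RLT": 0.13667,
    "INT_RTL": -0.29956,
    "RAILCROSS": 2.53527,
    "FRTH_LEG": -0.03758,
    "CURVE": 0.99022,
    "SKEW": 0.02244,
    "THRU_SW": 0.99344,
    "INT_SW": -0.48141,
}
FLORIDA_INTERCEPT = -3.33964


def propensity_score(covariates,
                     coefficients=FLORIDA_COEFFICIENTS,
                     intercept=FLORIDA_INTERCEPT):
    """Return the logit-model probability that a site is a CGT intersection.

    Covariates missing from `covariates` are treated as zero (an assumption
    of this sketch, not a statement about the report's data coding).
    """
    linear_predictor = intercept + sum(
        coefficients[name] * value for name, value in covariates.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical intersection; these values are invented for illustration.
example_site = {"AADTThrough": 30000, "AADTIntersecting": 5000,
                "CURVE": 1, "SKEW": 10}
example_score = propensity_score(example_site)
print(f"Estimated propensity score: {example_score:.3f}")
```

Because the through road AADT coefficient is positive, raising `AADTThrough` in the hypothetical site raises the estimated score, all else being equal.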

 

Table 11. South Carolina propensity score model.
Variable              Coefficient    Standard Error    p-Value
LN_AADTThrough            1.49003           1.37409      0.278
LN_AADTIntersecting      -0.70140           0.41815      0.093
THRU_RLT                 -0.19744           0.92572      0.831
INT_RTL                   0.53584           1.55880      0.731
FRTH_LEG                  1.73509           1.14918      0.131
CURVE                    -0.93645           1.36229      0.492
Through_Shoulder         -0.24410           1.50289      0.871
Intercept                -10.5416          13.05849      0.420

 

The distributions of the estimated propensity scores for Florida and South Carolina (by intersection type) are shown in figure 17. The plots of the propensity score distributions for the unmatched groups show that the ranges of values were similar for Florida and dissimilar for South Carolina. Because the sample size for both States was small, the overlap in the propensity score distributions between CGT and comparison intersections was not large enough to obtain covariate balance using NN matching. When NN matching was used, fewer than 10 total intersections from both States combined matched well based strictly on the estimated propensity scores.
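
The limited overlap described above can be illustrated with a minimal sketch of NN matching on the propensity score. The report does not state the matching tolerance it used, so the `caliper` parameter, the function name, and all scores below are assumptions chosen only to show how poor overlap leaves treated sites without an acceptable match.

```python
def nearest_neighbor_match(treated_scores, comparison_scores, caliper):
    """Match each treated site to the closest comparison site by score.

    Matching is done with replacement. A treated site whose nearest
    comparison site is farther away than `caliper` is left unmatched (None).
    """
    matches = {}
    for treated_id, treated_score in treated_scores.items():
        best_id = min(
            comparison_scores,
            key=lambda cid: abs(comparison_scores[cid] - treated_score),
        )
        distance = abs(comparison_scores[best_id] - treated_score)
        matches[treated_id] = best_id if distance <= caliper else None
    return matches


# Hypothetical propensity scores, invented for illustration only.
treated = {"T1": 0.82, "T2": 0.52, "T3": 0.95}
comparison = {"C1": 0.20, "C2": 0.50, "C3": 0.60}

matched = nearest_neighbor_match(treated, comparison, caliper=0.10)
print(matched)
```

In this made-up example, only the treated site whose score falls inside the range of the comparison scores finds a match within the caliper; the two high-scoring treated sites do not.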

These two graphs show estimated propensity scores by State and treatment status. The graph on the left shows unmatched propensity score distributions for South Carolina. Count is on the y-axis (0-3 above the x-axis indicating the count for the untreated group, 0-3 below the x-axis indicating the count for the treated group), and propensity score is on the x-axis from 0.01 to 0.99. The graph on the right shows unmatched propensity score distributions for Florida. Count is on the y-axis (0-3 above the x-axis indicating the count for the untreated group, 0-3 below the x-axis indicating the count for the treated group), and propensity score is on the x-axis from 0.01 to 0.99. For both graphs, two types of bars are shown: untreated and treated.

Figure 17. Chart. Plots of estimated propensity scores by State and treatment status.

Since NN matching did not yield the desired covariate balance without a significant reduction in sample size, Mahalanobis matching was implemented. For the Mahalanobis matching using the Florida data, the propensity score, through and intersecting road traffic volumes, through and intersecting road posted speed limits, and intersecting road shoulder width were included as covariates. For Mahalanobis matching using the South Carolina data, the natural log of through and intersecting road traffic volumes, through and intersecting road posted speed limits, and through and intersecting road lane and shoulder widths were included as covariates. Matching was done with replacement, and no CGT intersection was dropped from the dataset. Most comparison intersections were matched only once; eight were used more than once (seven in Florida and one in South Carolina).
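
A minimal sketch of Mahalanobis matching with replacement is given below. To stay short and dependency-free it uses only two covariates (so the covariance matrix can be inverted by hand) and estimates the covariance from the pooled sample; both choices, along with the function names and site values, are assumptions for illustration and not the report's implementation, which used more covariates.

```python
from statistics import mean


def covariance_matrix_2d(rows):
    """Sample covariance terms (sxx, sxy, syy) for a list of (x, y) pairs."""
    xs, ys = zip(*rows)
    mx, my = mean(xs), mean(ys)
    n = len(rows)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_squared(a, b, cov):
    """Squared Mahalanobis distance between two (x, y) points."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = a[0] - b[0], a[1] - b[1]
    # Quadratic form d' S^-1 d with the 2x2 inverse written out explicitly.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def mahalanobis_match(treated, comparison):
    """Match each treated site to its closest comparison site.

    Comparison sites stay in the pool after being used (replacement),
    so no treated site is ever dropped.
    """
    cov = covariance_matrix_2d(list(treated.values()) + list(comparison.values()))
    return {
        tid: min(comparison,
                 key=lambda cid: mahalanobis_squared(tvec, comparison[cid], cov))
        for tid, tvec in treated.items()
    }


# Hypothetical sites: (through road AADT, through road posted speed in mph).
treated_sites = {"T1": (30000, 45), "T2": (32000, 45)}
comparison_sites = {"C1": (29000, 45), "C2": (12000, 35), "C3": (45000, 35)}

pairs = mahalanobis_match(treated_sites, comparison_sites)
print(pairs)
```

Here both hypothetical treated sites are paired with the same comparison site, which is the kind of reuse that matching with replacement permits.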

The matching results for both States indicated no significant differences for the majority of the covariates based on the standardized bias. Plots of the absolute standardized bias for each of the covariates are shown for Florida and South Carolina in figure 18 and figure 19, respectively. For Florida, the variables with significant bias (greater than 25 percent) remaining after matching were the through road traffic volumes and posted speeds. For South Carolina, the variables with significant bias remaining after matching were the natural log of through and intersecting road traffic volumes and the through road posted speeds. The Mahalanobis matching was effective at removing the bias in all of the other observed covariates. The variables with significant remaining bias were added as predictor variables in each of the CMF models so that the regression adjusts for the remaining differences between the CGT and comparison intersections.
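
The standardized bias check can be sketched as follows. The formula used here (difference in means divided by the square root of the average of the two group variances, expressed in percent) is one common definition and is assumed rather than quoted from the report; the speed values and function name are hypothetical. The 25-percent threshold is the one stated above.

```python
import math
from statistics import mean, variance


def standardized_bias(treated_values, comparison_values):
    """Standardized bias in percent for one covariate."""
    pooled_sd = math.sqrt(
        (variance(treated_values) + variance(comparison_values)) / 2.0
    )
    return 100.0 * (mean(treated_values) - mean(comparison_values)) / pooled_sd


# Hypothetical posted speeds (mph), invented for illustration only.
treated_speeds = [45, 50, 55]
comparison_before = [40, 45, 50]   # comparison group before matching
comparison_after = [45, 50, 50]    # comparison group after matching

bias_before = abs(standardized_bias(treated_speeds, comparison_before))
bias_after = abs(standardized_bias(treated_speeds, comparison_after))

THRESHOLD = 25.0  # percent, as stated in the text
print(f"before: {bias_before:.1f}%, after: {bias_after:.1f}%")
print("still imbalanced after matching:", bias_after > THRESHOLD)
```

In this made-up case matching reduces the bias but leaves it above the threshold, which is the situation in which the covariate would be carried into the regression model as a predictor.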

This bar graph shows the absolute standardized bias for covariates in Florida. The y-axis is labeled absolute standardized bias and ranges from 0 to 125 in increments of 25. The x-axis is labeled variables and includes 14 variables: AADTThrough, AADTIntersecting, AADTMiss, THRU_RLT, INT_RTL, FRTH_LEG, CURVE, SKEW, THRU_SPEED, INT_SPEED, THRU_LW, THRU_SW, INT_LW, and INT_SW. Two types of bars are shown: before matching and after matching. Each variable shows both bars except FRTH_LEG, which has only a before matching bar (because no bias remained after matching). When the gray bar is smaller than the black bar, the bias for that variable was reduced by matching. When the gray bar is larger than the black bar, the bias for that variable increased with matching.

Figure 18. Graph. Absolute standardized bias for covariates in Florida data.

 

This bar graph shows the absolute standardized bias for covariates in South Carolina. The y-axis is labeled absolute standardized bias and ranges from 0 to 125 in increments of 25. The x-axis is labeled variables and includes 14 variables: AADTThrough, AADTIntersecting, AADTMiss, THRU_RLT, INT_RTL, FRTH_LEG, CURVE, SKEW, THRU_SPEED, INT_SPEED, THRU_LW, THRU_SW, INT_LW, and INT_SW. Two types of bars are shown: before matching and after matching. Each variable shows both bars except FRTH_LEG, CURVE, and THRU_SW, which have only before matching bars (because no bias remained after matching). When the gray bar is smaller than the black bar, the bias for that variable was reduced by matching. When the gray bar is larger than the black bar, the bias for that variable increased with matching.

Figure 19. Graph. Absolute standardized bias for covariates in South Carolina data.

Since a number of covariates were still not balanced based on the standardized bias measures, genetic matching was also implemented to improve the matching. K-S tests were then done using the entire sample (both States combined) to help determine the level of covariate balance for the unmatched data and for both matching methods. The results of the tests are shown in table 12.

 

Table 12. K-S test results.
Variable            Unmatched    Mahalanobis Matching    Genetic Matching
AADTThrough         <0.0001*     <0.0001*                0.081
AADTIntersecting    0.009*       0.04*                   0.083
AADTMiss            0.237        0.713                   0.104
THRU_SPEED          <0.0001*     <0.0001*                0.073
INT_SPEED           0.032*       0.06                    0.05*
INT_LW              0.077        0.617                   0.801
INT_SW              0.097        0.577                   0.098
IntNumLane          0.265        0.052                   0.259
THRU_LW             0.001*       0.053                   0.078
THRU_SW             0.133        0.051                   0.627
ThruNumLane         <0.0001*     <0.0001*                <0.0001*
FRTH_LEG            0.003*       0.126                   0.02*
CURVE               0.676        1                       0.955
DEFLECTION          0.007*       0.036*                  0.133
* = Statistically significant difference (i.e., p-Value ≤ 0.05).
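
For readers unfamiliar with the two-sample K-S test, the sketch below computes the test statistic (the largest gap between the two empirical distribution functions) and an asymptotic p-value from the Kolmogorov series. The report does not say which software or p-value method produced table 12, so the asymptotic approximation, the function names, and the sample values are all assumptions for illustration.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n_a, n_b, terms=100):
    """Asymptotic two-sided p-value (approximate for small samples)."""
    lam = math.sqrt(n_a * n_b / (n_a + n_b)) * d
    if lam < 0.3:
        # The series converges slowly here, and the p-value is essentially 1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical through road AADT values, invented for illustration only.
cgt_aadt = [31000, 35000, 38000, 42000, 47000]
comparison_aadt = [12000, 15000, 18000, 22000, 26000]

d_stat = ks_statistic(cgt_aadt, comparison_aadt)
p_value = ks_p_value(d_stat, len(cgt_aadt), len(comparison_aadt))
print(f"D = {d_stat:.2f}, p = {p_value:.4f}")
```

A p-value at or below 0.05 would be read, as in table 12, as a statistically significant difference between the two groups for that covariate.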

 

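As shown in table 12, the K-S tests indicate that the following variables were all significantly different at the 95-percent confidence level using Mahalanobis matching: AADTThrough, AADTIntersecting, THRU_SPEED, ThruNumLane, and DEFLECTION.

The following variables were all marginally different when using Mahalanobis matching (p-values just above 0.05 in table 12): INT_SPEED, IntNumLane, THRU_LW, and THRU_SW.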

When using genetic matching, the covariate balance was improved over the Mahalanobis matching. The results of the genetic matching indicate that the only variables that were significantly different between the two groups were INT_SPEED, ThruNumLane, and FRTH_LEG.

 

 

Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101