U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-HRT-16-057
Date: December 2016

 

Cooperative Adaptive Cruise Control Human Factors Study: Experiment 2—Merging Behavior

 

CHAPTER 3. RESULTS

The results related to CACC and workload, physiological arousal, merge behavior, driving performance, distraction, and visual fixations are presented in this chapter.

WORKLOAD

The NASA-TLX was administered verbally at three points during the experiment: shortly after the first merge (exit 4), roughly halfway between the third and fourth merges (exit 14), and shortly after the fourth merge (exit 16). The effects of CACC and longitudinal assistance on workload were tested using generalized estimating equations (GEEs; normal response distribution, identity link function) with location as a repeated measure and experimental treatment condition as the between-group factor of interest. Resulting mean estimates with 95-percent confidence intervals for each condition and location are shown in figure 1.

This figure is a bar graph displaying National Aeronautics and Space Administration Task Load Index (NASA-TLX) values. The values are grouped by exit numbers 4, 14, and 16 on the x-axis, while estimated workload (NASA-TLX) is on the y-axis from 0 to 60. Three bars are shown: control, cooperative adaptive cruise control (CACC) without MA, and CACC with MA. NASA-TLX scores are markedly greater for the control condition as compared to the CACC with MA and CACC without MA groups. Mean values at exit 4 are 58.96 for control, 19.47 for CACC with MA, and 35.38 for CACC without MA. Mean values at exit 14 are 40.82 for control, 16.32 for CACC with MA, and 18.23 for CACC without MA. Mean values at exit 16 are 37.27 for control, 17.63 for CACC with MA, and 19.94 for CACC without MA.
MA = Merge assist.

Figure 1. Graph. Estimated mean workload (NASA-TLX) by treatment group and location.

As expected, experimental treatment condition significantly affected workload as measured by the NASA-TLX (χ2(2) = 32.76, p < 0.001). The mean NASA-TLX value for the control condition (M = 45.94) was significantly greater than the means for both the CACC with merge assist group (M = 17.40) and the CACC without merge assist group (M = 24.52; Dunnett-adjusted p-value, pD < 0.002). The CACC with and without merge assist groups were not significantly different from one another (pD > 0.05). Location within the drive also significantly affected reported workload level (χ2(2) = 23.84, p < 0.01). The mean NASA-TLX value at exit 4 (M = 37.63) was significantly greater than the means at both exit 14 (M = 25.20) and exit 16 (M = 25.03; pD < 0.002). However, the mean NASA-TLX scores did not differ between exits 14 and 16. Participants reported higher workloads near the beginning of the drive than near the end of the drive, even though each NASA-TLX assessment took place on a straightaway and after a merge.

As shown in figure 1, a significant location-by-condition interaction was found (χ2(4) = 11.39, p = 0.02). To examine this interaction more closely, each location was explored separately. Condition significantly affected perceived workload at all three NASA-TLX assessment locations: exit 4 (χ2(2) = 32.51, p < 0.001), exit 14 (χ2(2) = 16.02, p = 0.003), and exit 16 (χ2(2) = 8.89, p = 0.012). Within each of the assessments, the control condition had a significantly greater mean NASA-TLX score than the CACC with merge assist and CACC without merge assist groups (all pD < 0.05). The interaction is the result of a difference in the first NASA-TLX assessment scores at exit 4. At this exit, the CACC without merge assist group had significantly greater workload scores than the CACC with merge assist group. This difference did not surface at the exit 14 and exit 16 assessments.

PHYSIOLOGICAL AROUSAL

The physiological measures of arousal assessed were GSR, eyelid opening, and pupil diameter. Each of these metrics naturally varied among participants, so raw values were converted to standardized z-scores. Data were sampled at approximately 120 Hz. (Some variation in sampling rate occurred as a result of occasional degraded signal quality.) The combined data sampling rate and length of each participant’s driving session (30 to 35 min) generated an extremely large set of data. In order to better manage and grasp the datasets, the following analyses focus on the eight periods of interest identified in table 1.
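The per-participant standardization described above can be sketched as follows. This is a minimal illustration only: the function name `standardize`, the sample values, and the use of the population standard deviation are assumptions, because the report does not state which standard deviation was used or how the conversion was implemented.

```python
from statistics import mean, pstdev

def standardize(samples):
    """Convert one participant's raw samples to z-scores.

    Assumption: the population standard deviation is used; the report
    does not say which form was applied.
    """
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        # A constant signal carries no variation to standardize.
        return [0.0 for _ in samples]
    return [(x - mu) / sigma for x in samples]

# Made-up raw readings for a single hypothetical participant.
raw_gsr = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scores = standardize(raw_gsr)
```

Because each participant is standardized against his or her own mean and spread, a z-score of zero marks that participant's average level, which lets values be compared across participants whose raw baselines differ.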

GSR

GSR is generally considered to be sensitive to sympathetic nervous system arousal, and it is more sensitive to spikes in arousal than it is to gradual changes in arousal over longer periods of time. If merging is indeed stressful, higher levels of GSR should be seen for the merging periods (2, 4, 6, and 8) relative to the cruising periods (1, 3, 5, and 7). Furthermore, the drivers in the control condition should also exhibit greater levels of GSR relative to those who used the CACC system as a result of the arousal-reducing effect of the automation.

Two participants in the control condition yielded unusable GSR data and, as a result, were excluded from the GSR analyses.

Mean-standardized GSR scores were analyzed using GEE (normal response distribution, identity link function, etc.). Resulting mean estimates with 95-percent confidence intervals for each condition and period are shown in figure 2.

This figure shows two bar graphs displaying standardized galvanic skin response (GSR) scores for pre-merge periods (graph on the left) and post-merge periods (graph on the right). In the pre-merge periods graph, pre-merge periods are on the x-axis and include 1, 3, 5, and 7. Estimated GSR (z-score, conductance) is on the y-axis from -1.6 to 1.2. The values tend to stay below the zero mark. Specifically, the mean GSR values by period are -1.01 at period 1, -0.43 at period 3, -0.21 at period 5, and 0.27 at period 7. For the post-merge periods graph, post-merge periods are on the x-axis and include 2, 4, 6, and 8. Estimated GSR (z-score, conductance) is on the y-axis from -1.6 to 1.2. The values remain above the zero mark. The mean GSR values by period are 0.02 at period 2, 0.30 at period 4, 0.39 at period 6, and 0.93 at period 8.

Figure 2. Graph. Estimated mean GSR (z-score, conductance) by period.

Time period in the drive significantly affected GSR (χ2(7) = 139.16, p < 0.001). As shown in figure 2, overall, GSR was significantly greater during merge periods than preceding cruise periods. In other words, participants were more aroused during the merge periods than during the cruise periods.

However, participant condition did not significantly affect mean GSR values (χ2(2) = 1.49, p > 0.05). That is, the presence of CACC did not significantly influence arousal as assessed by GSR. No significant interaction between participant condition and time period was found (χ2(14) = 17.26, p > 0.05).

Eyelid Opening

Reduced eyelid opening is often associated with reduced alertness levels. In other words, as people become more relaxed or tired, eyelids tend to droop. If CACC reduces alertness, one might expect eyelid opening (measured in millimeters) to be smaller as the eyelids begin to droop with lower arousal levels (especially over time). As with GSR, the raw eyelid-opening measures were converted to z-scores. Eyelid-opening observations that the eye-tracking software classified with a quality rating less than 75 percent were excluded. This resulted in the exclusion of 39 percent of the eyelid-opening readings.

GEE (normal response distribution, identity link function, etc.) was used to assess the influence of experimental condition, period, and their interaction on eyelid opening. Overall, experimental condition did not significantly affect eyelid opening (χ2(2) = 2.16, p > 0.05), nor did the time period significantly influence eyelid opening (χ2(7) = 6.02, p > 0.05). The interaction between time period and condition was not significant (χ2(14) = 15.28, p > 0.05).

Pupil Diameter

Pupil diameter measurements for which the eye-tracking system reported less than 75 percent quality were excluded from the analysis. As with GSR and eyelid opening, each participant’s pupil diameter observations across the eight 15-s periods were converted to z-scores. Once again, a GEE was used to assess the influence of experimental condition, period, and their interaction on pupil diameter. Resulting mean estimates with 95-percent confidence intervals for each condition and period are shown in figure 3.

This figure shows two bar graphs displaying standardized estimated pupil diameter scores for pre-merge periods (graph on the left) and post-merge periods (graph on the right). In the pre-merge periods graph, pre-merge periods are on the x-axis and include 1, 3, 5, and 7. Estimated pupil diameter (z-score) is on the y-axis from -0.4 to 0.8. The values stay below the zero mark. Specifically, the mean pupil diameter values by period are -0.01 at period 1, -0.09 at period 3, -0.22 at period 5, and -0.12 at period 7. In the post-merge periods, post-merge periods are on the x-axis and include 2, 4, 6, and 8. Estimated pupil diameter (z-score) is on the y-axis from -0.4 to 0.8. The values tend to remain above the zero mark. The mean pupil diameter values by period are 0.52 at period 2, 0.08 at period 4, 0.10 at period 6, and 0.30 at period 8.

Figure 3. Graph. Estimated mean pupil diameter (z-score) by period.

Time period in the drive significantly affected pupil diameter (χ2(7) = 44.12, p < 0.001). Mean pupil diameter was significantly greater in period 2 than all other periods (except period 8; p < 0.05). This suggests that during the first merge of the experimental drive, participants were more alert despite the practice drive merging. In addition, the mean pupil diameter was significantly greater during period 8 than all other periods except periods 2 and 4 (p < 0.05). While this result may seem somewhat surprising, it may be an artifact of the experimental design. Participants were aware that this was the last merge and that the experimental session would soon be over. As such, participants may simply have been more alert in anticipation of the completion of the task.

The participant experimental condition did not significantly affect pupil diameter (χ2(2) = 3.30, p > 0.05). The interaction between time period and experimental condition was also not significant (χ2(14) = 19.75, p > 0.05).

DISTRACTION

The NASA-TLX assessment indicated that the CACC system with merge assistance reduced workload as compared with the control condition. Despite this, no differences were found in physiological arousal levels between the experimental groups. However, drivers can mitigate the tendency toward reduced arousal on long drives by engaging in arousal-stimulating secondary activities. In this experiment, participants were not discouraged from engaging in these activities. Care was also taken to avoid encouraging these activities, although participants were told that they could listen to the car radio or do what they normally do while driving.

To explore potential engagement in other arousal-increasing tasks, non-driving activities were recorded during two segments in the drive. The first segment was the 30 s prior to the beginning of the first exit maneuver (exit 4). The second segment was 30 s near the exit 13 overhead sign. Table 3 shows the non-driving activities participants engaged in at these locations.

Table 3. The number of participants engaging in observable non-driving related activities by experimental condition group, combined across both observation periods.

Non-Driving-Related Activity | Control | With CACC Merge Assist | Without CACC Merge Assist
Listening to radio | 7 | 8 | 9
Talking/singing | 0 | 3 | 2
Listening to radio and talking | 0 | 2 | 2
Moving hand away from steering wheel | 9 | 3 | 4
Moving hand away from steering wheel and listening to radio | 6 | 1 | 2
Talking and moving hands | 0 | 0 | 2
Talking, moving hands, and listening to radio | 0 | 0 | 1
Listening to radio, pushing buttons on radio, and moving hands away from steering wheel | 0 | 1 | 0
Listening to radio and pushing buttons on radio | 1 | 0 | 0
Total | 23 | 18 | 22

 

As a result of the relatively small number of occurrences of the different types of potentially distracting non-driving activities, all were collapsed by group into a single number for analysis. Experimental condition did not significantly affect the number of non-driving related activities that participants engaged in (χ2(2) = 1.10, p > 0.05). Similarly, no difference between the two observation periods was found (χ2(1) = 0.62, p > 0.05), nor was the interaction between experimental condition and observation period significant (χ2(2) = 1.63, p > 0.05).

MERGE BEHAVIOR

Drivers’ actions during each merge were closely monitored to detect differences in driver behavior both over time and as a result of experimental condition. These behaviors included merge success and position, gap selection, and the distance used to complete the merge. The following analyses are not based on the eight previously defined driving segments but rather on the merges themselves. The beginning of each merge was defined consistently across all participants as the moment when the driver passed a specified point on the on-ramp (shortly after passing through the signalized intersection); merge endings were defined as the moment when half of the driver’s vehicle was laterally inside of the CACC platoon in the main lane of traffic.

Merge Success

As in the real world, a successful merge is one in which the driver avoids colliding with either vehicle defining the selected gap. As shown in table 4, several participants experienced a collision in the first merge attempt (despite the preceding practice drive), but the collision rate decreased with subsequent merges.

Table 4. Frequency of collisions by treatment group and merge number.

Condition | Merge 1 | Merge 2 | Merge 3 | Merge 4
Control | 9 | 2 | 2 | 1
CACC without | 5 | 1 | 1 | 3
CACC with | 0 | 0 | 0 | 0

 

It should be noted that a single participant in the CACC without merge assist group reengaged the CACC system earlier than instructed during the second and fourth merges. This participant turned on the CACC while still on the acceleration ramp. As a result, the participant did not appropriately adjust speed in order to find an appropriate gap and collided with other vehicles during both of these merges.

Because none of the drivers in the CACC with merge assist group collided with another vehicle, this group was excluded from the GEE model (binomial response distribution and logit link function) analysis. That is, only the control and CACC without merge assist conditions were included. (It should be noted that if the drivers in the CACC with merge assist condition did not override the system or lose control of the vehicle, then it was not possible to collide with another vehicle in the simulation.) As one would expect, the analysis revealed no difference in collision rates between the control and CACC without merge assist groups. (Both groups were required to control speed and steering during the merge; χ2(1) = 0.15, p > 0.05.) However, the merges themselves did influence the likelihood of a collision (χ2(3) = 15.16, p = 0.002). More specifically, participants were more likely to experience a crash during the first merge than in the three subsequent merges (Bonferroni correction for multiple comparisons, p < 0.005). It appears that with practice, drivers became more familiar with and better at merging into traffic in the simulator. One might expect similar patterns in the real world.

The interaction between participant condition and merge number was not significant in its influence on collisions during the merge (χ2(3) = 2.38, p > 0.05).

Merge Gap Position

Merge position describes the location within the selected gap where drivers chose to insert themselves into the platoon. Here it is defined as the ratio between (1) the distance between the front bumper of the participant and the rear bumper of the vehicle ahead and (2) the distance between the rear bumper of the participant and the front bumper of the following vehicle. In other words, values closer to zero reflect merges closer to the front of the gap (i.e., closer to the vehicle ahead), and a value of 0.5 reflects a perfectly centered gap.
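The merge position measure can be illustrated with a short sketch. For a value of 0.5 to correspond to a perfectly centered merge, one reading is that the front distance is expressed as a share of the combined front and rear distances. That normalization, the function name `merge_position`, and the example distances are assumptions for illustration; the report does not give the exact computation.

```python
def merge_position(front_gap_m, rear_gap_m):
    """Return merge position on a 0-to-1 scale.

    Assumed reading: front distance divided by the sum of the front and
    rear distances, so 0 is at the car ahead, 0.5 is centered, and 1 is
    at the car following.
    """
    total = front_gap_m + rear_gap_m
    if total <= 0:
        raise ValueError("combined gap must be positive")
    return front_gap_m / total

# Hypothetical bumper-to-bumper distances in meters.
centered = merge_position(10.0, 10.0)        # equal distance to both cars
near_lead_car = merge_position(5.0, 15.0)    # closer to the car ahead
```

Under this reading, the centered example evaluates to 0.5 and the second example to 0.25, matching the scale described in the text.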

The algorithm used to control vehicle speed for those drivers in the CACC with merge assist was designed to place participant vehicles equally distant between two vehicles, allowing for a simple merge with only lateral adjustment in position.

GEE (normal response distribution and identity link function) modelling found a significant effect of treatment group (χ2(2) = 11.29, p = 0.004). Perhaps surprisingly, the participants in the control condition tended to merge at a location in the gap similar to that of the participants in the CACC with merge assist group. In contrast, the CACC without merge assist group entered the gap significantly closer to the vehicle in front of the participant vehicle in the platoon than both the control and CACC with merge assist groups (p < 0.005).

Merge number did not significantly affect merge gap position (χ2(3) = 5.49, p > 0.05). The interaction between merge number and participant treatment condition was also not significant (χ2(6) = 7.64, p = 0.27). Figure 4 shows the mean merge gap ratios by condition.

The bar graph highlights each group’s mean merge position relative to a perfectly centered merge. The three treatment groups are on the x-axis (control, cooperative adaptive cruise control (CACC) without MA, and CACC with MA), and estimated merge position is on the y-axis and includes (from bottom to top) car ahead, 0.25, 0.50, 0.75, and car following. The graph highlights that the CACC without MA group entered the vehicle platoon significantly closer to the car ahead than did the other groups. Mean merge position values are 0.66 for the control group, 0.62 for the CACC without MA group, and 0.67 for the CACC with MA group.

Figure 4. Graph. Estimated mean merge position by treatment group.

Gap Selection

All CACC drivers (both without and with longitudinal merge assistance) were presented with constant, repeating 1.1-s merge gaps. Control drivers, on the other hand, were presented with the following repeating sequence of possible merge gaps (the minimum is 0.7 s, and the maximum is 1.5 s):

  1. 0.7 s.

  2. 1.1 s.

  3. 1.5 s.

  4. 0.9 s.

  5. 1.4 s.

  6. 1.2 s.

  7. 0.8 s.

  8. 1.0 s.

  9. 1.3 s.

The platoon into which all drivers merged was moving at a nearly constant speed of 112.65 km/h (70 mi/h), meaning that a 0.1-s difference in merge gap is equivalent to 3.13 m (10.27 ft).
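The conversion from time gap to distance stated above follows directly from the platoon speed. The sketch below reproduces that arithmetic; the constant and function names are illustrative only.

```python
KMH_TO_MS = 1000.0 / 3600.0   # kilometers per hour to meters per second
M_TO_FT = 1.0 / 0.3048        # meters to feet (exact definition of the foot)

def gap_length_m(speed_kmh, gap_s):
    """Distance in meters covered at a constant speed during a time gap."""
    return speed_kmh * KMH_TO_MS * gap_s

PLATOON_SPEED_KMH = 112.65    # 70 mi/h, as stated in the text

per_tenth_second_m = gap_length_m(PLATOON_SPEED_KMH, 0.1)
per_tenth_second_ft = per_tenth_second_m * M_TO_FT
cacc_gap_m = gap_length_m(PLATOON_SPEED_KMH, 1.1)  # constant 1.1-s CACC gap

print(round(per_tenth_second_m, 2), round(per_tenth_second_ft, 2))  # 3.13 10.27
```

By the same calculation, the constant 1.1-s gap presented to the CACC drivers corresponds to roughly 34.4 m at this speed.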

Gap selection among control drivers was modelled using GEE (normal response distribution and identity link function) as a function of merge number and prior gap (geographically, the gap behind the selected gap). The merge number did not affect the gap selected (χ2(2) = 2.88, p = 0.24), suggesting that participants did not modify their gap selection criteria after repeating the task several times. Prior gap was found to have a significant effect (χ2(1) = 6.28, p = 0.01), but this is believed to be an artifact of the experimental design: the model produced a negative coefficient on prior merge gap, indicating that an increase in the prior gap was associated with a decrease in the selected gap. However, the sequence of gaps presented is guaranteed to produce this decision in six of the nine possible combinations.

Table 5 shows the gaps presented to participants in the control group (repeated in this order) and the corresponding number of times each gap was selected. While the order in which the gaps were presented was not random, a chi-square test revealed no significant difference in the gap selected (χ2(8) = 11.41, p > 0.05). This indicates that participants were likely to enter the roadway at the gap presented at the bottom of the on-ramp rather than adjust speed and distance to find a more preferred or desirable gap.

Table 5. Gaps presented to the control group and corresponding number of times selected.

Gap (s) | Number of Occurrences
0.7 | 4
1.1 | 6
1.5 | 9
0.9 | 5
1.4 | 8
1.2 | 15
0.8 | 8
1.0 | 8
1.3 | 5
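The chi-square statistic reported for gap selection can be reproduced from the counts in table 5 with a goodness-of-fit calculation. The sketch below assumes that the null hypothesis is equal selection frequency across the nine gaps; the report does not state the null hypothesis explicitly, although this assumption yields the reported value. The variable names are illustrative, and the critical value is the standard tabulated figure for 8 degrees of freedom at the 0.05 level.

```python
# Observed selection counts from table 5, keyed by gap size in seconds.
gap_counts = {0.7: 4, 1.1: 6, 1.5: 9, 0.9: 5, 1.4: 8,
              1.2: 15, 0.8: 8, 1.0: 8, 1.3: 5}

observed = list(gap_counts.values())
total_selections = sum(observed)

# Assumption: under the null hypothesis every gap is equally likely.
expected = total_selections / len(observed)

chi_square = sum((o - expected) ** 2 / expected for o in observed)
degrees_of_freedom = len(observed) - 1

# Tabulated chi-square critical value for df = 8, alpha = 0.05.
CRITICAL_VALUE_DF8 = 15.507
is_significant = chi_square > CRITICAL_VALUE_DF8

print(degrees_of_freedom, round(chi_square, 2), is_significant)  # 8 11.41 False
```

The statistic falls below the critical value, which is consistent with the reported p > 0.05.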

Distance Used

The distance required to execute a merge may reflect the ease and/or comfort with which drivers make each merge. A short distance suggests that the driver quickly found and merged into a gap judged to be acceptable. However, a longer distance may suggest less comfort with gaps in the surrounding area or the need to adjust speed significantly to enter the traffic flow.

This distance was modelled with GEE (normal response distribution, identity link function) as a function of treatment group, merge number, and the interaction thereof. Resulting mean estimates with 95-percent confidence intervals for each condition and merge number are shown in figure 5.

This figure is a bar graph displaying the interaction effects of merge number and experimental condition on the mean distance used to merge. Merge number is on the x-axis from 1 to 4, and estimated merge distance is on the y-axis from 0 to 600 m (0 to 1,969 ft). The three experimental groups are shown: control, cooperative adaptive cruise control (CACC) without MA, and CACC with MA. Mean distance values at merge 1 are 547.16 m (1,795 ft) for the control group, 543.34 m (1,783 ft) for the CACC without MA group, and 531.24 m (1,743 ft) for the CACC with MA group. Mean distance values at merge 2 are 530.49 m (1,740 ft) for the control group, 503.31 m (1,651 ft) for the CACC without MA group, and 464.38 m (1,524 ft) for the CACC with MA group. Mean distance values at merge 3 are 574.26 m (1,884 ft) for the control group, 520.25 m (1,707 ft) for the CACC without MA group, and 491.87 m (1,614 ft) for the CACC with MA group. Mean distance values at merge 4 are 546.51 m (1,793 ft) for the control group, 542.29 m (1,779 ft) for the CACC without MA group, and 469.17 m (1,539 ft) for the CACC with MA group.
1 m = 3.3 ft.

Figure 5. Graph. Estimated mean distance used to merge by merge number and experimental condition.

Participant condition significantly affected the distance used to complete a merge (χ2(2) = 11.29, p = 0.004). On average, both the control group (M = 550.08 m (1,804.72 ft)) and the CACC without merge assist group (M = 527.33 m (1,730.09 ft)) used more distance to merge into the flow of traffic than the CACC with merge assist group (M = 489.10 m (1,604.66 ft)). That is, both groups that manually controlled speed used more of the ramp space than the group whose speed was controlled by the system. This may have implications in terms of driver expectation of merge location and optimization of on-ramp throughput.

The distance used to complete the merge did not vary significantly based on merge number (χ2(3) = 5.49, p > 0.05). However, a significant interaction between experimental condition and merge number was found (χ2(6) = 17.33, p = 0.008). To further explore the interaction, distance was examined individually at each of the merges. No significant effect of participant condition was found at the first merge (χ2(2) = 0.30, p > 0.05) or the second merge (χ2(2) = 3.26, p > 0.05). However, significant differences between conditions were found at both the third (χ2(2) = 11.40, p = 0.003) and fourth (χ2(2) = 6.09, p = 0.048) merges. In the case of the third merge, the control group used significantly more distance to complete the merge than both the CACC with merge assist and CACC without merge assist groups (all pD < 0.05). At the fourth merge, the control group and the CACC without merge assist group performed similarly, both using significantly more distance to complete the merge than the CACC with merge assist drivers.
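The per-merge p-values above can be checked against their chi-square statistics. For a chi-square distribution with 2 degrees of freedom, the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. The sketch below applies that identity to the four per-merge statistics; the function and variable names are illustrative, and the closed form holds only for 2 degrees of freedom.

```python
import math

def chi_square_p_value_df2(statistic):
    """Upper-tail probability for a chi-square statistic with 2 degrees
    of freedom, using the closed form exp(-x / 2)."""
    if statistic < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.exp(-statistic / 2.0)

# Per-merge condition effects on merge distance, as reported in the text.
merge_statistics = {1: 0.30, 2: 3.26, 3: 11.40, 4: 6.09}
merge_p_values = {merge: chi_square_p_value_df2(x)
                  for merge, x in merge_statistics.items()}

for merge, p in merge_p_values.items():
    print(merge, round(p, 3), "significant" if p < 0.05 else "not significant")
```

The third and fourth merges round to 0.003 and 0.048, in agreement with the reported values, while the first two merges remain well above 0.05.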

DRIVING PERFORMANCE

The measures of driving performance taken during this experiment were steering entropy and average absolute steering wheel torque. Unlike the physiological metrics, neither of these required standardization. The following analyses focus on the previously defined periods of interest.

Steering Entropy

Steering entropy is a technique that captures corrective response and is frequently used to assess driver inattentiveness. One might expect higher levels of inattention in the groups with CACC because these drivers were not required to maintain speed and may have subsequently had more resources available to allocate to other tasks (including mind wandering).

Steering entropy was calculated for each subject within each 15-s cruising period (i.e., the periods where drivers were not expected to actively adjust steering to enter the platoon). Experimental condition was not found to significantly affect steering entropy values (χ2(2) = 5.48, p > 0.05). Further, neither period (χ2(3) = 2.72, p > 0.05) nor the interaction between period and experimental condition (χ2(6) = 4.44, p > 0.05) was significant.

VISUAL FIXATIONS

One way in which visual attention can be inferred is by examining where drivers are looking. Drivers in the CACC with merge assist condition did not need to control speed to successfully merge into the main travel lane. As a result, these drivers may not have felt the need to visually track traffic as closely as the control and CACC without merge assist conditions. This section explores the differences and similarities in glance patterns.

Dynamic regions of interest (ROIs) were established to capture the number and duration of visual fixations to the merge area (the area to the right of the participant’s vehicle while merging into the CACC platoon, which is depicted in figure 6 as a red rectangle). Fixations were defined using eye-tracking software as consecutive gaze points falling within a radius of 1 percent of the projected screen within 300 ms (ending when new gaze points fall outside of an established radius for more than 120 ms).

These three screenshots show the progression of the dynamic region of interest (ROI). As the perspective of the driver moves from the top of the on ramp toward the merge area, the red highlighted ROI moves from the center of the field of view to the right of the field of view. The highlighted area of the dynamic ROI begins small as the driver is at the top of the on ramp (left screenshot), grows to a medium size as the driver is approaching the main travel lane (center screenshot), and encompasses the entire travel lane as the driver merges (right screenshot). The ROI captures where a driver might look to determine if there are potentially conflicting vehicles during a merge.

Figure 6. Screenshots. Illustrated dynamic merge area ROI.

Two different approaches were used to examine glance behavior during merges. In the first approach, the number of fixations to the dynamic ROI was analyzed. In this case, participant condition did significantly influence the number of fixations to the ROI (χ2(2) = 7.12, p = 0.028). The control (M = 2.65) group had significantly more fixations to the ROI than the CACC with merge assist group (M = 2.34, pD = 0.024). The CACC without group (M = 2.41) did not significantly differ from the CACC with group or the control group (pD > 0.05).

In the second approach, the length of each merge was taken into consideration. Recall that significant differences in the amount of time utilized to complete the merges appeared across the different participant conditions. Therefore, the total amount of time spent looking at the ROIs was divided by the total merge time, thus creating a proportion value. Participant condition did not significantly affect the proportion of time drivers spent looking at the dynamic ROI during the merge (χ2(2) = 2.41, p > 0.05). Similarly, neither the merge number (χ2(3) = 0.05, p > 0.05) nor the interaction between participant condition and merge number (χ2(6) = 10.78, p > 0.05) was significant.
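The proportion measure used in the second approach can be sketched as a simple ratio. The function name and the example durations below are hypothetical; the report does not describe how fixation durations were aggregated beyond dividing total looking time by total merge time.

```python
def roi_time_proportion(fixation_durations_s, merge_duration_s):
    """Share of a merge spent fixating on the region of interest:
    total fixation time divided by total merge time."""
    if merge_duration_s <= 0:
        raise ValueError("merge duration must be positive")
    return sum(fixation_durations_s) / merge_duration_s

# Hypothetical merge: three ROI fixations during a 15-s merge.
example_proportion = roi_time_proportion([0.4, 0.6, 0.5], 15.0)
```

In this made-up example, 1.5 s of fixation time over a 15-s merge gives a proportion of 0.1. Normalizing by merge duration keeps a group that simply took longer to merge from appearing to look at the ROI more.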

TRUST IN THE CACC SYSTEM

Both the CACC with and without merge assist groups were required to accept some level of trust in the system. Recall that those participants in the CACC with merge assist group were not required to accelerate or brake at any point to successfully complete the drive. Of the drivers in this group, only one ever used their own speed controls to override the system. It is not clear, however, whether this person did not trust the system or simply did not understand the functionality of CACC. Throughout much of the drive, the participant manually controlled speed (by pressing the accelerator and keeping CACC engaged), spun out during the second merge, and did not reengage the system during the fourth merge.

Among participants in the CACC without merge assist group, trust was examined during the cruising periods only because speed was manually controlled as drivers prepared to merge. Out of 16 participants in this group, two engaged the accelerator pedal during a cruise period (one in period 1 and the other in period 7). However, in both cases, the pedal was used minimally (not necessarily in a manner indicative of distrust in the system) and was possibly the result of simply resting the foot on the pedal.

 

 

Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101