U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-16-056     Date:  December 2016

 

Cooperative Adaptive Cruise Control Human Factors Study: Experiment 1—Workload, Distraction, Arousal, and Trust

 

CHAPTER 3. RESULTS

The findings with respect to CACC and workload, physiological arousal, distraction, and crash avoidance are presented separately in this chapter.

WORKLOAD

The NASA-TLX was administered verbally at three points during the test: after an initial merge event, 15 min into the drive, and immediately after the critical event. For the control and CACC with crash conditions, this was after the participant had come to a stop or collided with the preceding vehicle. For the CACC with merge event, this was after the vehicle returned to a speed of 70 mi/h and following distance of 1.1 s. For the CACC with communications failure condition, this was after the CACC system was reengaged and the following distance was again 1.1 s.

The mean NASA-TLX scores by condition and period are shown in figure 7. The test for significant effects in NASA-TLX ratings was a multivariate analysis of variance with location as a repeated measure with three levels and treatment condition as a between-groups factor with four levels. As expected, the control group consistently rated workload higher than the CACC groups (F (1, 3) = 14.5, p < 0.001). There was also a significant location-by-condition interaction (F (6, 90) = 27.4, p < 0.001), which was the result of the CACC with crash group rating their workload higher than the other CACC groups after the critical crash event. The 95-percent confidence limits for the means of the control group are shown in figure 7. The confidence limits of the means for the other groups were of similar magnitude but are omitted from the figure to avoid unnecessary clutter.

This line graph shows the overall National Aeronautics and Space Administration Task Load Index (NASA-TLX) score as a function of treatment group and location. Overall NASA-TLX score is on the y-axis from 0 to 100. Location is on the x-axis with three location categories: after merge, mid-cruise, and after final event. There are four lines, one for each participant group: control, cooperative adaptive cruise control (CACC) with crash, CACC with merge, and CACC with system failure. Mean NASA-TLX scores for the control group are 53, 28, and 61 for after merge, mid-cruise, and after final event locations, respectively. Mean NASA-TLX scores for the CACC with crash group are 16, 9, and 42 for after merge, mid-cruise, and after final event locations, respectively. Mean NASA-TLX scores for the CACC with merge group are 12, 7, and 16 for after merge, mid-cruise, and after final event locations, respectively. Mean NASA-TLX scores for the CACC with system failure group are 17, 14, and 24 for after merge, mid-cruise, and after final event locations, respectively. Error bars for the 95-percent confidence limits are only shown for the control group. These error bars range between 12 and 15.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 7. Graph. NASA-TLX scores as a function of treatment group and location in the scenario.

PHYSIOLOGICAL AROUSAL

The physiological measures of arousal were GSR, eyelid opening, and pupil diameter. Because each participant drove for approximately 34 min and the data were sampled at up to 120 Hz, subsequent analyses focused on periods identified in the method section.

GSR

GSR is generally considered to be sensitive to sympathetic nervous system arousal, and it is more sensitive to transient spikes in arousal than to gradual changes over longer periods of time. Consequently, if CACC results in lower levels of arousal, the differences would be expected to be most pronounced in comparisons of the CACC groups with the control condition at periods 2 and 3 and in comparisons of the CACC with crash avoidance condition with the control condition at periods 4 and 5.

Because absolute levels of skin conductance can and did vary greatly between individuals, the GSR scores for each participant were converted to standardized z-scores by subtracting the participant’s mean skin conductance from each conductance reading and dividing the difference by the standard deviation of skin conductance.
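The within-participant standardization described above can be sketched as follows. This is a minimal illustration only: the participant IDs and conductance values are hypothetical, and the use of the population standard deviation is an assumption because the report does not state which form was used.

```python
from statistics import mean, pstdev

def standardize(readings):
    """Convert one participant's readings to z-scores (mean 0, SD 1)."""
    m = mean(readings)
    sd = pstdev(readings)  # assumption: population SD; the report does not specify
    if sd == 0:
        raise ValueError("readings are constant; z-scores are undefined")
    return [(x - m) / sd for x in readings]

# Hypothetical skin conductance readings for two participants whose
# absolute levels differ by roughly a factor of five.
participants = {
    "P01": [2.0, 2.5, 3.0, 2.5],
    "P02": [10.0, 12.0, 14.0, 12.0],
}

# Standardize each participant against his or her own mean and SD.
z_scores = {pid: standardize(vals) for pid, vals in participants.items()}
```

After standardization, both hypothetical participants have the same z-score profile even though their raw conductance levels differ greatly, which is the reason for converting before comparing groups.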

The resulting z-scores were analyzed using three generalized estimating equation (GEE) models with normal response distributions and identity link functions. The first model analyzed the five 15-s periods as a repeated measure and the four treatment groups as a between-groups measure and yielded no significant main effect or interaction.

A second model combined the three CACC conditions and compared this combined grouping with the control across the first four periods. (The fifth period was omitted because the CACC treatment groups were exposed to different events in that period.) The second model also yielded no significant main effect or interaction.

The third model compared the CACC with crash avoidance group with the control group in the fifth period only, where both of these groups were exposed to the crash event. Again, there was no significant difference in standardized GSR between these groups.

The null hypothesis of no difference in physiological arousal as measured by standardized GSR between the control and CACC groups could not be rejected.

Eyelid Opening

If CACC reduces alertness, then one might expect eyelid opening to become smaller as the eyelids begin to droop with lower arousal levels. This might lead to a smaller eyelid opening for the CACC groups as a function of period, particularly in periods 3 and 4 compared to the other periods, which had potentially stimulating events (i.e., starting the drive, the first merge event, and the critical events).

As with GSR, the raw eyelid-opening measures were converted to z-scores. Eyelid-opening observations that the eye-tracking software classified with a quality rating less than 75 percent were excluded. This resulted in the exclusion of 69 percent of the eyelid-opening readings.
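The quality screening step can be illustrated with a short sketch. The function name, the representation of quality as a fraction between 0 and 1, and the sample values are all hypothetical; the only element taken from the text is the 75-percent cutoff.

```python
def filter_by_quality(samples, threshold=0.75):
    """Keep values whose quality rating is at least `threshold`.

    `samples` is a list of (value, quality) pairs, with quality expressed
    as a fraction from 0 to 1 (an assumption for this sketch).
    Returns the retained values and the fraction of samples excluded.
    """
    kept = [value for value, quality in samples if quality >= threshold]
    excluded_fraction = 1 - len(kept) / len(samples) if samples else 0.0
    return kept, excluded_fraction

# Hypothetical eyelid-opening readings with their quality ratings.
samples = [(9.8, 0.92), (10.1, 0.40), (9.5, 0.75), (3.2, 0.10)]
kept, excluded = filter_by_quality(samples)
```

Reporting the excluded fraction alongside the retained data makes it easy to see how much of the record a given threshold discards.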

Two GEE models with normal response distributions and identity link functions were fit to estimate eyelid opening as a function of condition, period, and their interaction. The first model included all groups and periods. There was a significant condition-by-period interaction, χ2(12) = 24.60, p = 0.02. This interaction appeared to be the result of the control group having significantly smaller eyelid openings in period 4. Because of the large variability in eyelid opening, the large amount of excluded data, and the large number of mean comparisons, this finding should not be given much credence.

The second model compared the control group eyelid opening to the CACC with crash avoidance group in period 5, which included the crash avoidance event. The difference between these groups was not significant.

Pupil Diameter

Pupil diameter measurements for which the eye-tracking software reported less than 75 percent confidence were excluded from analyses. These exclusions resulted in retention of 73.7 percent of the observations. As with GSR and eyelid opening, each participant’s pupil diameter observations across the five 15-s periods were converted to z-scores. The z-scores were then submitted to a GEE model with condition, period, and their interaction as factors. Figure 8 shows the estimated standardized means as a function of condition and period, where the three CACC groups have been collapsed into one CACC condition. The condition-by-period interaction was significant (χ2(12) = 36.12, p < 0.01), as was the main effect of period (χ2(4) = 74.04, p < 0.01). The source of the main effect is obvious: pupil diameters for all conditions were greater during the first two periods (after 5 min of driving) than in the last three periods (after 15 min or more of driving). The interaction does not appear to result from any easily explainable phenomenon; the control group had atypically large pupil diameters in period 2, and the CACC with communications failure group had larger pupil diameters than the other groups in period 4. Because all three CACC groups were exposed to the same stimulus conditions until period 5, there is no obvious explanation for the pattern that resulted in the significant interaction.

This graph shows the mean standardized pupil diameters of the control group and the combined mean of the cooperative adaptive cruise control (CACC) groups for each of the first four measurement periods. The y-axis is labeled “Estimated Mean Pupil Diameter (z-score)” and ranges from -1.40 to 1.40. The x-axis is labeled “Period” with values ranging from 1 to 4. The means and confidence limits are as follows: period 1 control group mean = 0.22, -0.01 to 0.45, period 1 CACC group mean = 0.40, 0.27 to 0.54; period 2 control group mean = 0.84, 0.44 to 1.23, period 2 CACC group mean = 0.21, 0.07 to 0.35; period 3 control group mean = -0.34, -0.64 to -0.04, period 3 CACC group mean = -0.46, -0.60 to -0.32; period 4 control group mean = -0.35, -0.53 to -0.18, and period 4 CACC group mean = -0.13, -0.32 to -0.06.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 8. Graph. Standardized pupil diameter as a function of condition and period.

Overall, the physiological measures provided no evidence that CACC resulted in a greater reduction in arousal over time than the control condition.

DISTRACTION

The physiological data, which were quite noisy, showed no clear indication of reduced levels of arousal that might lead to inattention errors. However, people can do things to mitigate the tendency toward reduced arousal on long drives by engaging in arousal-stimulating secondary activities. In this experiment, participants were not discouraged from engaging in these activities. Care was taken to avoid encouraging these activities, although participants were told that they could listen to the car radio or do what they normally do while driving. Table 2 shows the non-driving activities participants engaged in at locations that roughly correspond to periods 1, 3, and 4 in the other analyses. Because all of the CACC participants were treated the same in periods 1, 3, and 4, the three CACC groups were collapsed into one group, and their probability of engaging in observable diversions was compared to the control group. A GEE model with group (control versus CACC), period, and the interaction of group and period yielded a significant effect of period, χ2(2) = 6.9, p = 0.04. Although the estimated mean probability of control group members engaging in diversions (0.36) was less than the estimated mean probability of CACC group members engaging in diversions (0.52), this difference was not statistically significant, χ2(1) = 2.75, p = 0.10. However, because the original null hypothesis was unidirectional (i.e., CACC participants would not engage in more diversions than the control group), a one-sided statistical test is appropriate. A one-sided t-test found a significant difference between the groups (p < 0.05). Figure 9 shows the estimated mean probability and confidence limits for the means for engaging in a diversion as a function of period.
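The relationship between the two-sided and one-sided results can be illustrated with a worked calculation. The sketch below is not the authors' procedure (the report states that a one-sided t-test was used); it is a normal approximation, assumed here for illustration, that takes the reported χ2(1) = 2.75 and treats its square root as a standard normal z statistic. Halving the two-sided p-value is valid only because the observed difference lies in the hypothesized direction.

```python
import math

def p_values_from_chi2_1df(chi2):
    """Return (two-sided, one-sided) p-values for a 1-df chi-square statistic.

    Uses the identity that a 1-df chi-square variate is a squared standard
    normal variate, so P(chi2_1 > x) = 2 * P(Z > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    z = math.sqrt(chi2)
    two_sided = math.erfc(z / math.sqrt(2))
    one_sided = two_sided / 2  # valid only for an effect in the predicted direction
    return two_sided, one_sided

two_sided, one_sided = p_values_from_chi2_1df(2.75)
print(f"two-sided p = {two_sided:.3f}, one-sided p = {one_sided:.3f}")
```

Under this approximation, the two-sided value is close to the reported p = 0.10, and the one-sided value falls just under 0.05, consistent with the one-sided result reported above.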

Table 2. Number of participants engaging in observable
non-driving-related activities as the experiment progressed.

Non-Driving-Related Activity | Period 1 | Period 3 | Period 4
No detectable diversion | 29 | 26 | 21
Listening to radio | 19 | 20 | 21
Talking/singing/texting | 1 | 3 | 6
Listening to video on smartphone | 0 | 0 | 1

 

This point graph shows the estimated mean proportion of drivers engaged in non-driving-related diversions as a function of measurement period. The y-axis is labeled “Probability of Observable Diversion” and ranges from 0 to 1.0. The x-axis shows three periods: 1, 3, and 4. The period 1 mean is 0.41 with confidence limits of 0.28 and 0.55. The period 3 mean is 0.47 with confidence limits of 0.34 and 0.61. The period 4 mean is 0.57 with confidence limits of 0.43 and 0.70.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 9. Graph. Estimated mean proportion of drivers engaged in
non-driving-related diversions increased with time into drive.

GAZE LOCATION

As indicated in chapter 2, an eye-tracking system automatically recorded the estimated location of gazes to objects coded in its model. Table 3 shows the percentage of time spent gazing at each of the world model objects as a function of treatment group and roadway section. As can be seen, the control group spent considerably more time gazing at the multifunction display than did the CACC groups. Gaze time at the multifunction display appears to come at the expense of monitoring the road ahead. It should be noted that the “Out of Vehicle” classification included more than just glances to the road ahead. It included any recorded gaze direction other than to the defined objects (e.g., multipurpose display or rear-view mirror) and within the 200- by 40-degree area of the projection screen. Nonetheless, the vast majority of the out-of-vehicle glances were to the road ahead. The “Undetermined” classification in table 3 does not include eye blinks or times when gaze could not be detected for whatever reason (e.g., head down, eyes closed, or hand in front of face), but only cases where the gaze was qualified as good and was not in the direction of any of the defined objects.

Table 3. Percent of gaze time to defined objects as a function of condition and period.
Condition | Period | Multipurpose Display | Instruments | Rearview Mirror | Left Mirror | Right Mirror | Undetermined | Out of Vehicle
Control | 1 | 3.48 | 7.08 | 0.47 | 0.15 | 0.06 | 0.00 | 88.75
Control | 2 | 11.33 | 13.45 | 0.30 | 0.32 | 0.06 | 0.00 | 74.53
Control | 3 | 7.76 | 9.02 | 1.51 | 0.44 | 0.05 | 0.00 | 81.22
Control | 4 | 3.57 | 6.49 | 0.08 | 0.00 | 0.00 | 0.05 | 89.81
Control | 5 | 3.16 | 6.42 | 0.00 | 0.12 | 0.00 | 0.00 | 90.30
Control | Mean | 5.86 | 8.49 | 0.47 | 0.21 | 0.04 | 0.01 | 84.92
CACC crash avoidance | 1 | 0.94 | 3.10 | 0.03 | 0.09 | 0.00 | 0.00 | 95.84
CACC crash avoidance | 2 | 1.34 | 15.55 | 0.05 | 0.02 | 0.00 | 0.00 | 83.04
CACC crash avoidance | 3 | 0.79 | 5.95 | 0.04 | 0.35 | 0.00 | 0.00 | 92.87
CACC crash avoidance | 4 | 0.97 | 5.19 | 0.00 | 0.00 | 0.00 | 0.00 | 93.85
CACC crash avoidance | 5 | 3.41 | 1.21 | 0.00 | 0.00 | 0.00 | 0.42 | 94.97
CACC crash avoidance | Mean | 1.49 | 6.20 | 0.02 | 0.09 | 0.00 | 0.08 | 92.11
CACC cut-in | 1 | 1.44 | 6.76 | 0.08 | 0.13 | 0.00 | 0.14 | 91.44
CACC cut-in | 2 | 1.03 | 17.44 | 0.43 | 0.00 | 0.01 | 0.03 | 81.07
CACC cut-in | 3 | 0.91 | 1.87 | 1.04 | 0.06 | 0.00 | 0.00 | 96.11
CACC cut-in | 4 | 0.85 | 2.10 | 0.33 | 0.00 | 0.00 | 0.07 | 96.64
CACC cut-in | 5 | 1.21 | 3.63 | 0.31 | 0.22 | 0.00 | 0.00 | 94.63
CACC cut-in | Mean | 1.09 | 6.36 | 0.44 | 0.08 | 0.00 | 0.05 | 91.98
CACC failure | 1 | 0.07 | 3.10 | 0.03 | 0.01 | 0.00 | 0.00 | 96.79
CACC failure | 2 | 2.76 | 11.76 | 0.82 | 0.03 | 0.00 | 0.00 | 84.62
CACC failure | 3 | 0.61 | 3.27 | 0.67 | 0.00 | 0.00 | 0.00 | 95.46
CACC failure | 4 | 5.30 | 7.05 | 0.77 | 0.01 | 0.00 | 0.00 | 86.86
CACC failure | 5 | 3.50 | 2.32 | 0.15 | 0.13 | 0.00 | 0.00 | 93.90
CACC failure | Mean | 2.45 | 5.50 | 0.49 | 0.04 | 0.00 | 0.00 | 91.52
Grand mean |  | 2.70 | 6.63 | 0.35 | 0.10 | 0.01 | 0.04 | 90.17

 

Because the only difference in treatment of the CACC groups occurred in observation period 5, the data for the three CACC groups were collapsed into a single CACC group for periods 1 through 4. A GEE model with negative binomial response distribution and log link function was used to analyze the gaze distribution among objects in periods 1 through 4 and CACC versus control groups. This model revealed a significant main effect of period (χ2(3) = 19.5, p < 0.01) and condition (χ2(1) = 24.6, p < 0.01). These effects can be observed in figure 10. For the CACC participants, gaze time to the display in periods 2 and 4 may have resulted from the need of some participants to re-engage the CACC system. The large percent of time that the control group spent gazing at the multipurpose display in period 2 is likely the result of the changes in gap caused by the cut-in vehicle in that period.

This point graph shows the generalized estimating equation (GEE) estimated percent of time means for gazing at the multifunction display. The y-axis is labeled “Mean Estimated Percent Time Gazing at Multi-Purpose Display” and ranges from 0 to 25 percent. The x-axis is labeled “Period” and shows four periods numbered 1 through 4. Two types of data are shown: cooperative adaptive cruise control (CACC) and control. CACC means and confidence limits are as follows: period 1 mean = 0.6, 0.2 to 1.3; period 2 mean = 1.6, 0.9 to 2.8; period 3 mean = 0.8, 0.4 to 1.7; and period 4 mean = 2.2, 0.7 to 6.5. Control group means and confidence limits are as follows: period 1 mean =3.8, 2.2 to 6.3; period 2 mean = 12.4, 7.3 to 20.9; period 3 mean = 8.5, 4.3 to 16.6; and period 4 mean = 3.9, 2.0 to 7.4.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 10. Graph. GEE estimated mean percent of time gazing
at the multifunction display as a function of condition and period.

CRASH AVOIDANCE

None of the participants in the CACC with cut-in or CACC with system failure groups collided with another vehicle. This was not the case for participants in the crash avoidance condition in which the lead vehicle of the platoon braked to a stop from 70 mi/h at a rate of 32.2 ft/s2. As shown in table 4, five control group members crashed into the vehicle ahead, but only one of the CACC with crash avoidance members crashed. The difference in crash rates was significant by Fisher’s Exact Test (p < 0.02).
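To put the severity of the braking event in perspective, the following sketch works through the basic kinematics of the lead vehicle's stop. It assumes constant deceleration from the moment braking begins, which is a simplification not stated in the report, and it uses the 1.1-s following gap reported elsewhere in this chapter. The function and variable names are illustrative.

```python
MPH_TO_FTPS = 5280 / 3600  # feet per second in one mile per hour

def stopping_profile(speed_mph, decel_ftps2):
    """Return (initial speed in ft/s, time to stop in s, distance to stop in ft).

    Assumes constant deceleration (a simplifying assumption for this sketch).
    """
    v0 = speed_mph * MPH_TO_FTPS
    t_stop = v0 / decel_ftps2             # v = v0 - a*t reaches 0 at t = v0/a
    d_stop = v0 ** 2 / (2 * decel_ftps2)  # d = v0^2 / (2a)
    return v0, t_stop, d_stop

# Lead vehicle: 70 mi/h to a stop at 32.2 ft/s2 (approximately 1 g).
v0, t_stop, d_stop = stopping_profile(70, 32.2)

# Distance covered by a 1.1-s following gap at the same speed.
gap_ft = 1.1 * v0

print(f"initial speed: {v0:.1f} ft/s")
print(f"lead vehicle stops in {t_stop:.2f} s over {d_stop:.1f} ft")
print(f"1.1-s gap at 70 mi/h: {gap_ft:.1f} ft")
```

Under these assumptions, the lead vehicle stops in a little over 3 s, and a 1.1-s gap corresponds to roughly 113 ft of separation, which leaves a following driver very little time to detect the event and respond.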

Table 4. Crash counts for the two groups that were exposed to the crash avoidance event.

Group | Crashed | Avoided | Total
Control | 5 | 6 | 11
CACC with crash avoidance | 1 | 12 | 13

Crashes are often considered the ultimate measure of highway safety. However, crashes are a rather crude safety measure because they are rare outside of driving simulations and are generally reported in terms of number of crashes per 1 million mi driven. Time-to-collision (TTC) is used as a surrogate for crashes because the frequency of near misses (i.e., very short TTCs) is thought to be highly correlated with crash frequency but easier to observe.(18) To further evaluate the probability of a crash in scenarios like those in the simulation, TTC was analyzed.

MINIMUM TTC

The classic definition of TTC is range (R) divided by range rate, where range rate is the difference between the leading vehicle velocity and the following vehicle velocity. With this classical measure, minimum TTC suffers from a ceiling effect when R is zero (i.e., the following vehicle collides with the lead). To enable analysis of TTC even when collisions occur, Brown’s adjusted minimum TTC was used.(18) The adjusted minimum TTC takes into account velocity at the time of collision. The adjusted minimum TTC thus reflects the severity of the crash or near-crash event regardless of whether or not collision avoidance is successful. Brown describes the adjusted minimum TTC as follows:

The formula for the adjusted minimum TTC, which is only necessary if a collision occurs, is shown in figure 11, where VF and VL are the velocities of the following and lead vehicles, respectively, and aF and aL are the average accelerations of the following and lead vehicles, respectively, from the time the following vehicle begins to brake.

$$\text{Adj. Min. TTC} = \frac{V_F - V_L}{a_F - a_L}$$

Figure 11. Equation. Adjusted minimum time to collision.
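The two TTC measures can be sketched in code as follows. This is an illustration under stated assumptions, not the study's analysis code: the function names and example values are hypothetical, closing speed is taken as following-vehicle velocity minus lead-vehicle velocity so that TTC is positive while the gap is shrinking, and returning negative infinity for a zero denominator is a convention chosen here to mirror the behavior the report describes for a driver who does not decelerate.

```python
import math

def classic_ttc(range_ft, v_follow, v_lead):
    """Classic TTC in seconds: range divided by closing speed.

    Velocities are in ft/s. Returns infinity when the following vehicle
    is not closing on the lead vehicle (assumption for this sketch).
    """
    closing_speed = v_follow - v_lead
    if closing_speed <= 0:
        return math.inf
    return range_ft / closing_speed

def adjusted_min_ttc(v_follow, v_lead, a_follow, a_lead):
    """Adjusted minimum TTC at the moment of collision, per figure 11.

    Velocities are those at collision (ft/s); accelerations are the
    average accelerations (ft/s2) from the time the follower begins braking.
    """
    denom = a_follow - a_lead
    if denom == 0:
        return -math.inf  # assumed convention for a non-reacting follower
    return (v_follow - v_lead) / denom

# Hypothetical near miss: 100-ft range, closing at 50 ft/s.
print(classic_ttc(100.0, 100.0, 50.0))           # 2.0

# Hypothetical collision: follower still moving 30 ft/s, lead stopped,
# follower braking at an average of 20 ft/s2.
print(adjusted_min_ttc(30.0, 0.0, -20.0, 0.0))   # -1.5
```

In the hypothetical collision, the adjusted value is negative, and its magnitude grows with the speed difference at impact, which is how the measure conveys crash severity.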

It should be noted that one control participant was missing gap data because of an experimenter error during data collection. For that participant, minimum TTC was calculated by assuming that the initial gap was 1.1 s. Review of the video of this participant confirmed that the participant did not crash and the 1.1-s initial gap assumption was reasonable.

One CACC with crash avoidance participant showed no reaction to the rapid deceleration of the lead vehicle. When the following vehicle failed to decelerate, the adjusted TTC went to negative infinity, and minimum TTC became meaningless, at least in terms of computing mean TTC.(18) Therefore, this participant was excluded from the adjusted minimum TTC analysis.

Because the minimum TTC data did not appear to be normally distributed, a generalized linear model (GLM) with a gamma response distribution and log link function was used for significance testing. Also, because the gamma distribution cannot include negative values, each participant’s minimum TTC was transformed by adding 8.0 s, which shifted the range to 1.4 through 29.4. The overall test showed that the mean minimum TTCs between groups were significantly different (Wald χ2(3) = 9.2, p < 0.03). As can be seen in figure 12, which reflects the original scale (i.e., -8.0), the control group TTC was significantly less than that of the three groups that used CACC.
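The shift applied before fitting the gamma model, and its reversal for reporting, can be sketched as follows. The model fitting itself is omitted. The function names are hypothetical, and the example values are illustrative: the endpoints -6.6 and 21.4 are simply the reported shifted range (1.4 through 29.4) minus 8.0.

```python
SHIFT_S = 8.0  # constant added to every minimum TTC, per the text

def shift_for_gamma(ttc_values, shift=SHIFT_S):
    """Shift TTC values so that all are strictly positive for a gamma model."""
    shifted = [t + shift for t in ttc_values]
    if min(shifted) <= 0:
        raise ValueError("shift is too small; gamma responses must be positive")
    return shifted

def back_transform(estimate, shift=SHIFT_S):
    """Return a model estimate to the original TTC scale."""
    return estimate - shift

# Illustrative adjusted minimum TTCs (s); the endpoints are implied by the
# reported shifted range, and the middle value is made up.
raw = [-6.6, 0.0, 21.4]
shifted = shift_for_gamma(raw)

# A hypothetical estimated mean of 7.4 s on the shifted scale corresponds
# to -0.6 s on the original scale.
original_scale_mean = back_transform(7.4)
```

Because the shift is a constant, subtracting it from an estimated mean or confidence limit on the shifted scale recovers the value on the original TTC scale.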

This point graph shows the estimated means of adjusted time to collision (TTC) for the four treatment groups. The y-axis is labeled “Adjusted Minimum TTC” and ranges from -2.0 to 4.0 s. The x-axis is labeled with the names of the four experimental groups: control, cooperative adaptive cruise control (CACC) with crash, CACC with cut-in, and CACC with communications failure. The means and confidence limits are as follows for each of the four groups, respectively: control mean = -0.6 and confidence limits = -1.6 to 0.5; CACC with crash mean = 1.6 and confidence limits = 0.3 to 3.1; CACC with cut-in mean = 2.0 and confidence limits = 0.7 to 3.6; and CACC with communications failure mean = 1.2 and confidence limits = 0.0 to 2.6.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 12. Graph. Estimated adjusted mean TTC.

Minimum TTC depends in part on the driver’s reaction time. In the next section, driver reaction time to the crash event is examined.

REACTION TIME

Brake reaction time is defined as the time between when the car immediately ahead of the participant began braking and the time the participant first began to depress the brake pedal. One control and two CACC crash avoidance participants were excluded from this analysis because they either never braked or swerved out of the travel lane before braking. Despite the large difference in TTC between the control group and the CACC crash avoidance group, which responded to the same event as the control group, there was no significant difference in brake reaction time between these groups; a GLM with a gamma response distribution and inverse link function yielded a Wald χ2(1) = 0.42, p = 0.52. The brake reaction times for these two groups are shown in figure 13. This finding suggests that the better crash avoidance and larger minimum TTCs for the CACC group were the result of the CACC system automatically braking at 0.4 g. Alternatively, the larger CACC TTCs could have resulted if the CACC group had responded with more vigorous braking than the control group (i.e., if the CACC group went from zero to full brake pedal depression faster than the control group). This alternative explanation can be rejected because the control group tended, although not significantly, to brake more vigorously (i.e., reached full brake depression sooner) than the CACC group. Figure 14 shows the time taken to move the brake pedal position from off to full braking. The difference between groups was not significant; a GLM with a gamma response distribution and inverse link function produced a Wald χ2(1) = 3.07, p = 0.08.

This point graph shows the estimated mean brake reaction times for the control and cooperative adaptive cruise control (CACC) with crash groups. The y-axis is labeled “Brake Onset Reaction Time” and ranges from 0 to 4.0 s. The x-axis is labeled with the names of the two groups that were exposed to the crash event: control and CACC with crash. The estimated means and confidence limits are as follows: control mean = 2.6 and confidence limits = 1.6 to 3.6 and CACC with crash mean = 2.8 and confidence limits = 1.8 to 3.8.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 13. Graph. Estimated mean brake onset reaction times for the two groups that had the crash avoidance final event.

 

This point graph shows the estimated means for the lag between the onset of braking and when full brake pedal depression occurred. The y-axis is labeled “Full Brake Depression Lag” and ranges from 0 to 1.6 s. The x-axis is labeled with the names of the two groups that were exposed to the crash event: control and cooperative adaptive cruise control (CACC) with crash. The estimated means and confidence limits are as follows: control mean = 0.75 and confidence limits = 0.6 to 1.0 and CACC with crash mean = 1.1 and confidence limits = 0.9 to 1.5.
Note: Error bars represent estimated 95-percent confidence limits of the means.

Figure 14. Graph. Estimated mean time from beginning of brake pedal depression to full braking.

 

TRUST IN THE CACC SYSTEM

About 6.8 min into the drive (the moment where period 1 ended and period 2 began), a simulated CACC vehicle merged into the gap between the participant’s vehicle and the car ahead, approximately halving the participant’s following gap distance. All participants were exposed to the merge event. One measure of trust in the system is whether the participants in the CACC conditions trusted the system to maintain speed/gap control or intervened by braking to increase the gap or by pressing the accelerator to return to a 1.1-s gap. Only 1 of 36 CACC participants braked during the merge event, and 1 participant pressed the accelerator pedal. By comparison, all of the control condition participants used the brake pedal during the merge event.

 

 

Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101