U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-11-056    Date:  October 2012

 

Layered Object Recognition System for Pedestrian Sensing

6. EXPERIMENTS AND RESULTS

The proposed system consists of a stereo rig made of off-the-shelf monochrome cameras and a commercial stereo processing board that runs in a multicore personal computer environment.(23) The cameras are standard automotive-grade NTSC cameras with 720 × 480 image resolution and a 46-degree field of view. The stereo rig is mounted inside a vehicle (Toyota® Highlander) that also carries a dual-quad-core processing unit and electronics to power the computer from the vehicle battery. This test platform allows the researchers to conduct live experiments and collect data for offline processing.
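The stereo rig's range estimates follow from the standard disparity-to-depth relation Z = f·B/d. The sketch below is purely illustrative: the focal length and baseline are placeholder values, since the report does not state the rig's calibration.

```python
# Hypothetical sketch of stereo range estimation under a pinhole camera model.
# focal_px and baseline_m are invented placeholders, not the rig's real calibration.

def depth_from_disparity(disparity_px, focal_px=800.0, baseline_m=0.3):
    """Range in meters from a stereo disparity in pixels: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# With these placeholder parameters, a 6-pixel disparity maps to 40 m,
# the maximum evaluation range used in this chapter.
print(depth_from_disparity(6.0))  # 40.0
```

Because range grows as disparity shrinks, small disparity errors translate into large range errors at distance, which is one reason evaluation is bounded at 40 m.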

To evaluate system performance, the research team captured and ground-truth-marked a number of data sequences in various urban driving scenarios. The testing data included sequences of pedestrians crossing the road, cluttered intersections, and pedestrians darting out from between parked vehicles. The research team also acquired data from publicly available datasets, which are particularly challenging because they contain a large number of pedestrians in crowded urban settings.(30) The research team compared the performance of this system against that of other state-of-the-art systems on the public dataset.(30)

Several image examples of the data collected are provided in figure 39 through figure 51.

This photo shows pedestrians crossing a signalized intersection in an urban environment during the day. There are many vehicles on the road, and the road has sidewalks.

Figure 39. Photo. Pedestrians crossing at an intersection during the day under good lighting conditions.

This photo shows pedestrians crossing the street at a signalized intersection in an urban environment. There is a vehicle waiting for the pedestrians to cross before turning right.

Figure 40. Photo. Pedestrians crossing at an intersection during the day while a vehicle turns right.

This photo shows a pedestrian crossing a street at an unsignalized crosswalk across the street from a vehicle in an urban environment.

Figure 41. Photo. Pedestrians crossing an intersection at night.

This photo shows an adult and two children crossing a road at midblock during the evening in front of a vehicle. There are stores on the side of the road as well as a parked vehicle.

Figure 42. Photo. Pedestrians crossing a road at midblock during the evening.

This photo shows people crossing a street at midblock in an urban environment in front of a vehicle. There are buildings on both sides of the street.

Figure 43. Photo. Pedestrians crossing a road at midblock during the early evening.

This photo shows pedestrians crossing a street at night in an urban environment at an unsignalized intersection in front of a vehicle. There are also pedestrians crossing the street away from the vehicle at midblock.

Figure 44. Photo. Pedestrians crossing a road at an intersection at night.

This photo shows a vehicle in the center lane of a three-lane highway.

Figure 45. Photo. Vehicle driving on the highway.

This photo shows two vehicles side-by-side on a two-lane highway.

Figure 46. Photo. Second view of vehicles driving on the highway with tall vertical poles and an overhead bridge in the field of view.

This photo shows pedestrians in an urban environment crossing a multilane street at midblock. There are parked vehicles on both sides of the street.

Figure 47. Photo. Pedestrians crossing midblock in a multilane urban street with an overhead bridge in the overlapping background.

This photo shows a pedestrian crossing a snow-covered street at a signalized crosswalk in an urban environment while vehicles are waiting for the signal to change.
©INRIA (See Acknowledgements section)

Figure 48. Photo. Pedestrian crossing the street and right-turning vehicle in winter.

This photo shows people walking on the sidewalk on both sides of a street in an urban environment as a vehicle drives down the street.
©INRIA (See Acknowledgements section)

Figure 49. Photo. Pedestrians on the sidewalk in an urban environment during winter.

This photo shows people walking on a snow-covered street near a vehicle that is parked on the left side of the street.
©INRIA (See Acknowledgements section)

Figure 50. Photo. Pedestrians walking in the roadway near parked vehicles in an urban environment.

This photo shows pedestrians crossing the street at a signalized crosswalk in an urban environment during the day.
©INRIA (See Acknowledgements section)

Figure 51. Photo. Pedestrians at a crosswalk in front of a vehicle in bright conditions with saturated areas.

6.1 Evaluation Methodology

This section briefly discusses the experimental methodology and shows results on selected sequences. The system was evaluated by comparing its output against hand-marked ground-truth data. For a detailed evaluation, the research team analyzed performance under several factors.

6.2 Experimental Results

The results in this section are presented for typical sequences acquired at the research team’s campus, from a European dataset, and from a publicly available dataset.(31, 12) Overall, the research team captured over 2 h of video data using a vehicle owned by the research team and a vehicle maintained by the research team’s automotive tier 1 partner, Autoliv Electronics.

The results are representative of the developed system’s performance. It is important to note that for many of these sequences, the false positives per frame (FPPF) results are somewhat misleading: the sequences were acquired for the purpose of pedestrian detection and do not include the empty roads typical of regular driving scenarios.

Each of the tables below reports the performance of the developed system’s key configurations: the stereo-based PD (detector only), the detector plus classifier, and the detector plus classifier plus tracker. Results are shown for both in-path pedestrians and all pedestrians in the field of view up to 131.2 ft (40 m).
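The detection-rate and FPPF numbers in the tables can, in principle, be computed by matching detections to ground-truth boxes frame by frame. The sketch below is a minimal illustration rather than the report's exact protocol; the 0.5 intersection-over-union (IoU) matching threshold is an assumed convention.

```python
# Minimal sketch of detection-rate / FPPF scoring against hand-marked ground truth.
# Boxes are (x1, y1, x2, y2); the 0.5 IoU threshold is an assumption for this example.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union)

def evaluate(frames, iou_thresh=0.5):
    """frames: list of (detections, ground_truths) per frame.
    Returns (detection rate in percent, false positives per frame)."""
    matched = total_gt = false_pos = 0
    for dets, gts in frames:
        unmatched = list(gts)
        total_gt += len(gts)
        for d in dets:
            best = max(unmatched, key=lambda g: iou(d, g), default=None)
            if best is not None and iou(d, best) >= iou_thresh:
                unmatched.remove(best)  # each ground-truth box matches once
                matched += 1
            else:
                false_pos += 1          # unmatched or duplicate detection
    return 100.0 * matched / total_gt, false_pos / float(len(frames))
```

Because each ground-truth box can be matched only once, duplicate detections of the same pedestrian count as false positives under this scheme.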

Additionally, the research team tested the real-time system by driving the vehicle and qualitatively observing true detection and false positive (FP) performance. The system was tested while driving at speeds of 15 and 30 mi/h (24.15 and 48.3 km/h). Researchers also demonstrated the system multiple times to FHWA personnel at the research team’s campus in Princeton, NJ, and at the Turner-Fairbank Highway Research Center in McLean, VA. Several main observations were made during these live experiments.

Tabulated results are provided in table 3 through table 9. Table 3 results are as follows:

Table 3. In-path detection results for sequence 080613111722_BM-SHJ_cross-in-front (parking lot).
Mode                             Detection Rate (percent)   FPPF   Number of People
Detector only                    100                        0.04   70
Detector + classifier            87.14                      0      70
Detector + classifier + tracker  95.71                      0      70



Table 4 results are as follows:

Table 4. Full field-of-view detection results for sequence 080613111722_BM-SHJ_cross-in-front (parking lot).
Mode                             Detection Rate (percent)   FPPF   Number of People
Detector only                    100                        7.09   383
Detector + classifier            87.73                      0.36   383
Detector + classifier + tracker  96.87                      1.1    383



Table 5 results are as follows:

Table 5. Full field-of-view detection results for sequence 80613112933_SHJ_walk_BM_stand_on-side (parking lot).
Mode                             Detection Rate (percent)   FPPF    Number of People
Detector only                    95.73                      10.06   234
Detector + classifier            90.60                      0.54    234
Detector + classifier + tracker  98.29                      1.55    234



Table 6 results are as follows:

Table 6. Full field-of-view detection results for sequence EuropeTour_Innsbruck.0_20070128_42_SVS_Data.
Mode                             Detection Rate (percent)   FPPF    Number of People
Detector only                    90.54                      4.436   134
Detector + classifier            72.97                      0.58    134
Detector + classifier + tracker  85.14                      1.36    134



Table 7 results are as follows:

Table 7. Full field-of-view detection results for sequence EuropeTour_Wurzburg.0_20070126_19_SVS_Data.
Mode                             Detection Rate (percent)   FPPF   Number of People
Detector only                    86.43                      5.35   161
Detector + classifier            70.54                      0.98   161
Detector + classifier + tracker  74.03                      2.5    161



Table 8 results are as follows:

Table 8. In-path detection results for sequence seq00_rerun (Ess sequence).
Mode                             Detection Rate (percent)   FPPF   Number of People
Detector only                    94.56                      0.82   584
Detector + classifier            66.61                      0.16   584
Detector + classifier + tracker  92.81                      0.45   584



Table 9 results are as follows:

Table 9. Full field-of-view detection results for sequence seq00_rerun (Ess sequence).
Mode                             Detection Rate (percent)   FPPF    Number of People
Detector only                    91.91                      10.78   1,816
Detector + classifier            66.13                      1.56    1,816
Detector + classifier + tracker  89.21                      3.55    1,816

6.2.1 Comparison Between FHWA Results and Published State-of-the-Art Results

Figure 52 through figure 55 show receiver operating characteristic (ROC) curves illustrating the developed system’s performance on four sequences (Seq00, Seq01, Seq02, and Seq03). The figures also show comparisons with another representative approach from the literature.(31)
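ROC curves of this kind are traced by sweeping the classifier's confidence threshold and recording, at each setting, detection rate against false positives per frame. The sketch below uses hypothetical scores and counts; the real curves come from per-frame matching against ground truth.

```python
# Illustrative ROC-point generation: every (score, is_true_positive) pair below
# is hypothetical, not data from the report.

def roc_points(scored_dets, n_gt, n_frames, thresholds):
    """Return (FPPF, detection rate in percent) at each confidence threshold."""
    points = []
    for t in thresholds:
        kept = [is_tp for score, is_tp in scored_dets if score >= t]
        tp = sum(kept)       # true positives retained at this threshold
        fp = len(kept) - tp  # false positives retained
        points.append((fp / float(n_frames), 100.0 * tp / n_gt))
    return points

dets = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.4, False)]
pts = roc_points(dets, n_gt=4, n_frames=10, thresholds=[0.85, 0.65, 0.3])
# Lowering the threshold admits more true detections but also more false positives.
```

An ideal curve generated this way climbs steeply: detection rate approaches 100 percent while FPPF stays near zero.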

This graph shows a receiver operating characteristic (ROC) curve for public data sequence 00, which includes 499 frames and 1,581 annotations. Detection rate is on the y-axis from 0 to 100 percent in increments of 10 percent, and number of false positives per frame is on the x-axis from 0 to 3.5 in increments of 0.5. Two lines are shown on the graph: pedestrian detection performance without structural classification (SC) (solid red line) and pedestrian detection performance with SC (solid blue line). For the same false positive rate, the algorithm with structural classification yielded a higher detection rate (a better result). An ideal ROC curve reaches a detection rate close to 100 percent at a very low false positive rate and then levels off almost horizontally.

Figure 52. Graph. ROC curves for Seq00.

This graph shows a receiver operating characteristic (ROC) curve for public data sequence 01, which includes 1,000 frames and 5,207 annotations. Detection rate is on the y-axis from 0 to 100 percent in increments of 10 percent, and number of false positives per frame is on the x-axis from 0 to 3.5 in increments of 0.5. Four lines are shown on the graph: pedestrian detection performance without structural classification (SC) (solid red line), pedestrian detection performance with SC (solid blue line), Andreas Ess1 from literature (dotted pink line), and Andreas Ess2 (dotted blue line).

Figure 53. Graph. ROC curves for Seq01.

This graph shows a receiver operating characteristic (ROC) curve for public data sequence 02, which includes 451 frames and 1,731 annotations. Detection rate is on the y-axis from 0 to 100 percent in increments of 10 percent, and number of false positives per frame is on the x-axis from 0 to 3.5 in increments of 0.5. Three lines are shown on the graph: pedestrian detection performance without structural classification (SC) (solid red line), pedestrian detection performance with SC (solid blue line), and Andreas Ess1 from literature (dotted pink line).

Figure 54. Graph. ROC curves for Seq02.

This graph shows a receiver operating characteristic (ROC) curve for public data sequence 03, which includes 354 frames and 1,724 annotations. Detection rate is on the y-axis from 0 to 100 percent in increments of 10 percent, and number of false positives per frame is on the x-axis from 0 to 3.5 in increments of 0.5. Three lines are shown on the graph: pedestrian detection performance without structural classification (SC) (solid red line), pedestrian detection performance with SC (solid blue line), and Andreas Ess1 from literature (dotted pink line).

Figure 55. Graph. ROC curves for Seq03.

Example image outputs of the system are provided in figure 56 through figure 65. In the left image in figure 56, magenta and green pixels are those the SC detected as tall vertical structures. The image on the right shows, in blue, the detections that were rejected by the SC. The red rectangles indicate objects that were identified as pedestrians.

This figure shows two photos from the structure classifier (SC) in an alleyway. In the left image, magenta and green pixels are detected by the structure classifier as tall vertical structures. Blue pixels are detected as people and vehicles. The image on the right shows the detections that were rejected by the structure classifier in blue bounding boxes.
©INRIA (See Acknowledgements section)

Figure 56. Photo. Sample output from SC in an alleyway.

In the image on the left in figure 57, ground pixels are yellow, overhang/tree branch pixels are green, and buildings/tall vertical structure pixels are magenta. Blue pixels indicate regions containing objects that will be further processed by an appearance classifier. In the right image, blue boxes indicate objects rejected by the SC, and white boxes indicate potential pedestrians.
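As an illustration only (the report does not give the SC's actual rules), the height-based labeling visible in these figures can be sketched as a band test on each vertical column of stereo 3D points; the 0.3 m, 2.5 m, and 3.0 m thresholds below are invented for the example.

```python
# Hypothetical height-band rule mimicking the structure classifier's pixel labels.
# All thresholds are invented for illustration; the report's SC is more elaborate.

def classify_column(heights_m):
    """Label a vertical column of 3D point heights (meters above ground)."""
    top, bottom = max(heights_m), min(heights_m)
    if top < 0.3:
        return "ground"         # yellow in the figures
    if bottom > 2.5:
        return "overhang"       # green: tree branches, signs, bridges
    if top > 3.0:
        return "tall_vertical"  # magenta: buildings, poles
    return "candidate"          # blue: handed to the appearance classifier
```

Columns labeled "candidate" correspond to the blue regions that the appearance classifier then accepts or rejects.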

This figure shows two photos from the structure classifier (SC) in a dense urban scene with pedestrians in the path of the vehicle. In the left image, ground pixels are yellow, overhang/tree-branch pixels are green, and buildings/tall vertical structure pixels are magenta. Blue pixels indicate regions containing objects that will be further processed by an appearance classifier. In the right image, blue bounding boxes indicate objects rejected by the structure classifier, while white bounding boxes denote potential pedestrians.
©INRIA (See Acknowledgements section)

Figure 57. Photo. Sample output from SC in a dense urban scene with pedestrians in the vehicle path.

The image on the left in figure 58 shows ground pixels in yellow and tall vertical structure pixels in magenta and green. Pedestrian candidate regions are blue. In the right image, white boxes indicate detected pedestrian candidates, and blue boxes indicate rejected candidates.

This figure shows two photos from the structure classifier (SC) in an urban environment with pedestrians at varying distances from the vehicle. The left image shows ground pixels in yellow and tall vertical structures in magenta and green. Pedestrian candidate regions are blue. The right image shows the detected pedestrian candidates in white bounding boxes and rejected candidates in blue bounding boxes.
©INRIA (See Acknowledgements section)

Figure 58. Photo. Sample output from SC in an urban scene with pedestrians at varying distances from the vehicle.

In the image on the left in figure 59, the SC correctly rejects the poles and trees in the foreground, which are magenta and green. It also rejects portions of the bicycle parked near the sidewalk while validating the pedestrian detections. In the image on the right, pedestrian detections are shown in white boxes, while rejected candidates have blue boxes around them.

This figure shows two photos from the structure classifier (SC) in an urban environment with pedestrians entering a building and others in the distance ahead of the vehicle. The left image shows ground pixels in yellow and tall vertical structures in magenta and green. In the right image, pedestrian detections are shown in white bounding boxes, while rejected candidates have blue bounding boxes.
©INRIA (See Acknowledgements section)

Figure 59. Photo. Sample output from SC in an urban scene with pedestrians entering a building and others in the distance ahead of the vehicle.

In figure 60, the SC did not reject the person on the motorcycle or the light post on the median. The image shows tall vertical structures in magenta, overhanging structures in green, and possible pedestrians in blue.

This photo shows a structure classifier (SC) in an urban environment. Tall vertical structures are magenta and green, while pedestrian candidates are blue. The classifier does not reject the person on the motorcycle or the light post on the median.
©INRIA (See Acknowledgements section)

Figure 60. Photo. SC rejecting poles.

Figure 61 through figure 65 show pedestrians detected by the appearance classifier, indicated by red boxes.

This photo shows a pedestrian walking across a crosswalk while a vehicle waits to make a right turn. The pedestrian is detected by the appearance classifier, as indicated by a red bounding box.
©INRIA (See Acknowledgements section)

Figure 61. Photo. Appearance classifier recognizing a pedestrian.(12)

This photo shows two pedestrians in a parking lot. The appearance classifier detects them crossing in front of the vehicle and near parked vehicles, as indicated by red bounding boxes.
©INRIA (See Acknowledgements section)

Figure 62. Photo. Appearance classifier output recognizing pedestrians crossing in front of vehicles.

This photo shows several pedestrians close to buildings. They are detected by the appearance classifier, as indicated by red bounding boxes.
©INRIA (See Acknowledgements section)

Figure 63. Photo. Appearance classifier output recognizing pedestrians while making a left turn.

This photo shows pedestrians walking on a sidewalk next to a building ahead of a vehicle. They are detected by the appearance classifier, as indicated by red bounding boxes.
©INRIA (See Acknowledgements section)

Figure 64. Photo. Appearance classifier recognizing pedestrians in front of a vehicle in a busy urban street.

This photo shows pedestrians walking on a sidewalk in an urban environment 98.4 ft (30 m) ahead of a vehicle. They are detected by the appearance classifier, as indicated by red bounding boxes.
©INRIA (See Acknowledgements section)

Figure 65. Photo. Appearance classifier recognizing pedestrians 98.4 ft (30 m) ahead of a vehicle in a busy street.

 
