Featuring developments in Federal highway policies, programs, and research and technology.
Publication Number: FHWA-HRT-09-006
by David R.P. Gibson, Bao Lang, Bo Ling, Uma Venkataraman, and James Yang
With support from FHWA, researchers are integrating the latest camera technology with traffic control to improve safety at intersections.
In 2007, more than 4,600 pedestrians died in traffic crashes in the United States, according to the National Highway Traffic Safety Administration (NHTSA). That same year, crashes injured about 70,000 pedestrians. Zeroing in on intersections, NHTSA reports that 984 pedestrians were killed and 31,000 injured in 2005. Although these figures are lower than in previous years, the statistics underscore the continuing need for safety improvements. For instance, children 14 years old and younger accounted for 20 percent of all pedestrian injuries and 7 percent of all pedestrian fatalities. For NHTSA and the Federal Highway Administration (FHWA), even one fatality or injury is one too many.
As part of the U.S. Department of Transportation's Intelligent Transportation Systems (ITS) program, FHWA is conducting research and development of vehicle safety and driver information systems. For many systems and applications—such as IntelliDrive(SM), traffic control, security monitoring, and pedestrian counting and flow analysis—pedestrian monitoring could add value. Specifically, monitoring can help avoid potential harm to pedestrians when collision avoidance measures or emergency vehicle preemptions are imposed while pedestrians are present. And, pedestrian monitoring can help reduce delays, minimize fuel consumption, and limit vehicle emissions by facilitating traffic control optimization when pedestrians are absent.
Even after decades of research, pedestrian detection at street intersections remains a challenge. Despite the variety of existing technologies, including microwave radar; video image processing; and ultrasonic, acoustic, passive infrared (IR), active IR, piezoelectric, and magnetic sensors, these approaches have yet to excel in detecting pedestrians in real-world applications. The limitations of pedestrian sensors are largely due to the highly dynamic backgrounds typical of intersections. Variable weather and illumination conditions, for example, make it difficult to design system features and templates suitable for all situations. The high false alarm rate—that is, detecting pedestrians who are not really there—associated with these technologies has kept traffic engineers from deploying them on a widespread basis.
But the tide might be turning. Using funding available through the FHWA Small Business Innovation Research (SBIR) program, researchers have developed a new stereo vision-based approach for detecting pedestrians at intersections. The technique involves a prototype of a new IR, light-emitting diode (LED) stereo camera that can detect pedestrians both during the day and at night. The researchers also developed advanced pedestrian detection algorithms that extract generic three-dimensional (3-D) features from a stereo disparity map, isolating the human figures. The technology can discriminate pedestrians from vehicles because automobiles appear basically flat, while human bodies have concave shapes.
With support from the Massachusetts Highway Department (MassHighway), the researchers installed the prototype camera system at the busy State highway intersection of Route 9 and Route 47 in the town of Hadley, MA, for testing over a 3-week period. The results from this pilot test indicate that the prototype is on track toward being ready for commercial sale and widespread use.
"The pedestrian detection application using computerized stereo vision has great potential for improving pedestrian safety," says Subramanian N. Sharma, chief of engineering and research at the New Hampshire Department of Transportation's (NHDOT) Bureau of Traffic, who has been monitoring the progress of this research.
Intelligent Traffic Signal Management
At signalized intersections and midblock crosswalks, pedestrians use pushbuttons to make service requests, that is, to request the WALK signal. Once a request is granted, pedestrians can safely cross the street. But in many cases, after pressing the button, pedestrians do not wait for the signal but instead cross the street when they see a break in the traffic flow. When the crosswalk signal finally turns to WALK, pedestrians might no longer be in the crosswalk, and vehicles end up needlessly stopped. When pedestrians do wait for the WALK signal, they might cross the street quickly, also leaving stopped vehicles idling for no reason.
A reverse situation also can occur. The service time for crosswalk signals often is fixed. However, slow-moving pedestrians such as children or senior citizens might need more time to cross a street than was preconfigured in the system. In these cases, the crosswalk service time should be extended to improve safety.
Both applications—reducing and extending crosswalk times—require a pedestrian-monitoring device at the intersection, such as the new, robust system for detecting and tracking multiple pedestrians that the researchers developed and tested during this study. When the device detects pedestrians in the crosswalk, it sends a signal to the traffic signal controller, which in turn extends the pedestrian walk phase.
According to the manual for the Econolite ASC/3 traffic signal controller used for this research project, once the normal walk time is set, if a pedestrian is detected, the walk time will be extended until (1) the maximum walk time is reached, (2) the elapsed length of the walk extension plus the pedestrian clear time equal the maximum in effect, or (3) the detector input goes to false, meaning there are no pedestrians in the crosswalk. As long as pedestrians occupy the crosswalk, the traffic signals remain red on the potentially conflicting approaches designated by the traffic engineer. When the pedestrians no longer occupy the crosswalk for a short time, the device sends another signal to the traffic controller, which can then change the signal phase to the next appropriate phase for vehicular traffic.
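The three termination conditions described in the controller manual can be sketched in code. The following Python function is illustrative only; the names, parameters, and structure are hypothetical and do not reflect the Econolite ASC/3 firmware.

```python
# Hypothetical sketch of the walk-extension termination logic; names and
# structure are illustrative, not the actual controller implementation.

def should_end_walk(elapsed_walk, walk_extension, ped_clear_time,
                    max_walk, pedestrian_detected):
    """Return True when the extended walk phase may terminate."""
    # (1) The maximum walk time has been reached.
    if elapsed_walk >= max_walk:
        return True
    # (2) Walk extension plus pedestrian clear time equal the maximum in effect.
    if walk_extension + ped_clear_time >= max_walk:
        return True
    # (3) The detector input is false: no pedestrians remain in the crosswalk.
    if not pedestrian_detected:
        return True
    return False
```

As long as none of the three conditions holds, the walk phase continues and conflicting approaches stay red.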
This approach, a type of intelligent traffic signal transition logic, offers many benefits for a community. For example, it can help senior citizens and others safely negotiate the crosswalk. Intelligent pedestrian detection can reduce traffic congestion at street intersections, ease driver frustration, and reduce idling and associated fuel consumption and pollution. By contrast, many of today's intersection traffic control systems are necessarily designed for worst-case pedestrian crossing time scenarios and are not "intelligent" in the sense of being responsive to real-time situations. For example, the researchers had to redesign signal timing on an arterial to accommodate elderly pedestrians who could not safely cross the street in the allotted time. That time originally was designed to meet the requirements of the Manual on Uniform Traffic Control Devices (MUTCD), which called for assuming a walking speed of 4 feet (1.2 meters) per second. Note that FHWA expects to change this standard to 3.5 feet (1.07 meters) per second in the future.
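The difference between the two walking-speed assumptions is easy to quantify. This short Python sketch uses a hypothetical 60-foot crossing distance (not a figure from the article) to show how the assumed speed drives the required crossing time.

```python
# Illustrative arithmetic; the 60-foot crossing distance is hypothetical.

def clearance_time(crossing_distance_ft, walk_speed_fps):
    """Seconds a pedestrian needs to cross at an assumed walking speed."""
    return crossing_distance_ft / walk_speed_fps

width_ft = 60.0
print(clearance_time(width_ft, 4.0))   # 15.0 seconds at the 4 ft/s standard
print(clearance_time(width_ft, 3.5))   # about 17.1 seconds at 3.5 ft/s
```

The slower assumption adds roughly 2 seconds for this crossing, time that a responsive detector could grant only when a slow pedestrian is actually present.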
Software Components for Detection System
For automated pedestrian detection, traffic managers can employ stereo camera systems (twin cameras), wireless receivers, software algorithms, and stand-alone computers. The key issue in image capturing is sampling time. The most important issue for the algorithms is computation time. The cameras capture a new pair of samples once the computer and algorithms have completely processed the pair captured in the previous sampling. For pedestrians at street intersections, a system must capture two or three detections in 1 second to ensure proper detection.
Another approach is image differencing. Here, the same camera (right or left) takes two consecutive images, and a "difference image" is made by subtracting the two images. Image differencing is particularly suitable for detecting moving targets. In theory, stationary objects in the image background (such as buildings and streets) appear in both consecutive images, and computers can easily remove them.
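As a rough illustration of image differencing (not the authors' implementation), the following NumPy sketch flags pixels whose intensity changed between two consecutive grayscale frames; the fixed threshold is a placeholder.

```python
import numpy as np

# Image differencing in miniature (assumes 8-bit grayscale frames as
# NumPy arrays; the fixed threshold here is a placeholder).

def difference_mask(frame_prev, frame_curr, threshold=25):
    """Flag pixels whose intensity changed by more than `threshold`."""
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(frame_curr.astype(np.int16) - frame_prev.astype(np.int16))
    return diff > threshold

# Static background pixels cancel out; a moving object leaves a cluster
# of large differences in the mask.
```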
However, weather and illumination changes can generate "moving objects" and shadows that cause false detections. Using image differencing alone can generate a high number of false detections. For example, on a rainy day, image differencing can produce continuous detections even when no pedestrians are in the crosswalk. But the new stereo-based approach can further discriminate the moving objects (whether true or false) using 3-D features and thus resolve the problem in practice.
In reality, however, the difference image may retain parts of the stationary objects. This phenomenon is caused by camera "noise," which is mainly related to the quality of the image-sensing device, such as the charge-coupled device used in the study. Camera jiggling and illumination changes also can cause retention of stationary objects. Camera jiggling is unavoidable: because the stereo camera system is mounted outdoors at the street intersection, it jiggles in strong winds and other conditions. Illumination changes can randomly alter the pixel values of two consecutive images, making still objects appear in the difference image. Noise in digital cameras is hardware dependent (high-quality cameras are less noisy because of the electronics used) and can never be filtered out completely, so even advanced image-filtering algorithms will not remove all stationary objects from the difference image. To address this problem, the researchers developed an advanced image-filtering method that significantly reduces camera noise.
Compared to the image-differencing approach, detection using a single image offers some advantages. Because detection is made based on a single image, it is almost immune to camera jiggling and illumination changes. However, it is difficult to use a single camera or monocamera system to reliably detect pedestrians in an outdoor environment, such as at street intersections, largely because of dynamic variations of background and pedestrian appearance.
In the stereo camera approach, the systems usually detect pedestrians mainly based on 3-D features extracted from a disparity map (which provides the depth information of objects in an image), which can be time consuming for the system. The calculation of a full-size disparity map usually takes a few seconds, making it impossible for the system to capture two or three samples per second. Further, because a pedestrian usually occupies only a small area in the image, most disparity information in a full-size map is not even used for pedestrian detection. For example, buildings and light poles can have their own disparity maps as well. Because detecting these stationary objects is unnecessary, the stereo camera system can exclude them from the calculation of the overall disparity map, thus reducing computation time. Otherwise, the system would be unable to reach three or four detections per second using a low-cost industrial PC, which often has a slow central processing unit and limited memory.
To overcome problems associated with some difference images, the researchers developed a suite of approaches to filter out glitches generated by camera noise, camera jiggling, and illumination changes. To window out moving pedestrians—that is, detect and extract them from an image—the researchers developed a new method to estimate the baseline, or threshold, noise characteristics of the camera. The pedestrian detection system adaptively estimates the threshold value from the image content, independent of the camera characteristics (image pixels whose intensity values exceed this threshold are retained; otherwise, they are treated as noise and removed).
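The article does not specify the authors' estimator, but a content-based threshold can be illustrated with a common statistical rule: mean plus a multiple of the standard deviation of the difference image. The sketch below is an illustrative stand-in, not the researchers' actual method.

```python
import numpy as np

# One common content-based rule: treat the mean plus k standard
# deviations of the difference image as the noise floor. Illustrative
# only; not the researchers' actual estimator.

def adaptive_threshold(diff_image, k=3.0):
    """Estimate a noise threshold from the image content itself."""
    d = diff_image.astype(np.float64)
    return d.mean() + k * d.std()

def denoise(diff_image, k=3.0):
    """Zero out pixels at or below the estimated noise floor."""
    out = diff_image.copy()
    out[diff_image <= adaptive_threshold(diff_image, k)] = 0
    return out
```

Because the threshold is computed from each difference image rather than fixed in advance, the same code adapts to cameras with different noise levels.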
This method is essential for accurately extracting moving objects, because camera noise is not the only source of imperfections in the difference image, and it can be applied to various types of cameras. The pedestrian detection system applies the method simultaneously to the right and left images acquired from the stereo camera system, and moving pedestrians are windowed out from both images. By windowing out the pedestrians, the system can detect when a pedestrian is in the crosswalk and delay the signal change to allow more time for the person to cross the intersection safely.
Disparity Map Estimation
One of the main advantages of a stereo vision system is the ability to relate the distance between an object and the camera to the disparity between the two images taken by the right and left cameras. Researchers have studied disparity estimation for years, yet it remains a hurdle for the computer vision community. The main challenges are noise and occlusions (or blockages) in the image, a shortage of distinguishing textures in the search region, and depth discontinuities.
To accomplish the goal of capturing three or four detections per second, the researchers needed to avoid developing a computation-intensive estimation scheme for the disparity maps. Many academic methods, such as phase-based matching, Markov random field modeling, and dynamic programming, will not work quickly enough for near real-time pedestrian detection applications. However, the researchers' new approach offers an efficient method for estimating the disparity maps that satisfies the time constraint. In short, the researchers only estimate the disparity values for the objects windowed out and refine the disparity values using spatial correlations.
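A minimal sketch of the windowed idea follows, assuming a simple sum-of-absolute-differences (SAD) block search; the authors' exact matching scheme and their spatial-correlation refinement are not described in the article and are omitted here.

```python
import numpy as np

# Windowed disparity sketch: compute disparity only inside the bounding
# box of a windowed-out object, using a SAD block search. Illustrative
# only; the researchers' refinement step is omitted.

def window_disparity(left, right, box, max_disp=16, patch=3):
    """Per-pixel disparity inside box = (row0, row1, col0, col1)."""
    r0, r1, c0, c1 = box
    h = patch // 2
    disp = np.zeros((r1 - r0, c1 - c0), dtype=np.int32)
    for r in range(r0, r1):
        for c in range(c0, c1):
            ref = left[r - h:r + h + 1, c - h:c + h + 1].astype(np.int32)
            best_sad, best_d = None, 0
            # A left-image pixel at column c matches a right-image pixel
            # at column c - d; search only in-bounds shifts.
            for d in range(min(max_disp, c - h) + 1):
                cand = right[r - h:r + h + 1,
                             c - d - h:c - d + h + 1].astype(np.int32)
                sad = np.abs(ref - cand).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            disp[r - r0, c - c0] = best_d
    return disp
```

Restricting the search to the detected window is what keeps the per-frame cost low enough for several detections per second on modest hardware.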
Once the system estimates the disparity map, it can extract features from the map. Ideally, the disparity map of a pedestrian contains the corresponding parts of the body, such as the head, upper body, and legs. However, due to camera noise, camera jiggling, and illumination changes, the disparity map of a pedestrian is often disconnected or incomplete, making it difficult for the system to extract human body shapes from the map. Also, range information alone is not sufficient to discriminate between a moving vehicle and a walking pedestrian.
To overcome this challenge, the researchers developed a set of 3-D features from the disparity map. The features reflect the geometric differences between moving pedestrians and moving vehicles or other artifacts caused by camera jiggling or illumination changes. However, disparity alone is not enough to achieve the goal of near zero false detections. In addition to the 3-D disparity maps, the researchers designed the system to extract features from color images—features that can be used to categorize 3-D objects as either pedestrians or nonpedestrians—and use them in pedestrian detection and discrimination. Simply put, the color of a vehicle body, for example, often is uniformly distributed, while a person's clothes tend to have mixed colors, thus facilitating distinction between vehicles and pedestrians.
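One crude way to illustrate the color-uniformity cue (a hypothetical feature, not the researchers' actual feature set) is the spread of an object's pixel colors:

```python
import numpy as np

# Hypothetical feature: the average per-channel standard deviation of an
# object's pixel colors. A vehicle body tends toward one color (low
# spread); clothing tends to mix colors (high spread).

def color_spread(pixels_rgb):
    """Mean per-channel standard deviation over an object's pixels."""
    p = np.asarray(pixels_rgb, dtype=np.float64)
    return p.std(axis=0).mean()

car_like = [[200, 30, 30]] * 50                                      # uniform red
clothing_like = [[200, 30, 30], [20, 90, 180], [240, 240, 10]] * 17  # mixed
assert color_spread(car_like) < color_spread(clothing_like)
```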
IR LED Stereo Camera
The IR LED stereo camera used in this study consists of two stand-alone cameras paired together. The stereo system captures images under all illumination conditions. The cameras provide high-resolution color images during the day and gray-scale pictures in low-light conditions such as evening and night. The cameras use a high-resolution, 0.33-inch (0.85-centimeter) color charge-coupled device and operate at 5.8 gigahertz. One hundred LED emitters in each camera make it possible for the system to detect pedestrians 80-100 feet (24-30 meters) away in total darkness.
To construct the stereo camera, the researchers positioned the two IR LED cameras side by side, such that the two focal rays of the lenses are parallel and perpendicular to the stereo baseline, and the image planes of both lenses are coplanar. This arrangement ensures that the system can estimate the disparity map accurately.
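With this parallel-axis arrangement, depth follows from similar triangles: Z = fB/d, where f is the focal length in pixels, B the baseline, and d the disparity. The calibration numbers below are hypothetical, chosen only to show that nearer objects produce larger disparities.

```python
# Depth from disparity for a rectified, parallel-axis stereo rig:
# Z = f * B / d. The focal length and baseline are hypothetical values,
# not the prototype's calibration.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance (meters) to a point with the given disparity (pixels)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

print(depth_from_disparity(800, 0.25, 10))  # 20.0 m: small disparity, far away
print(depth_from_disparity(800, 0.25, 25))  # 8.0 m: large disparity, nearby
```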
Field Trial at a Highway Intersection
With assistance from MassHighway District 2, the researchers installed the prototype at a State highway intersection in Hadley, MA, for the 3-week field trial. Workers placed a mini-PC hosting all the detection algorithms inside a nearby traffic signal controller cabinet. The system configuration for this field trial included two wireless receivers and a wireless air card that received images from the stereo camera and provided wireless Internet access. Because the metal traffic controller cabinet blocks the wireless signals, the researchers placed the receivers and wireless card in a wooden box mounted beside the cabinet. They connected a separate underground power cable to the stereo camera through the inside of the traffic light pole, essentially invisible to the public.
The researchers designed the system to recover itself from power and Internet communication failures. They implemented a mechanism to track the operation status of the mini-PC, whereby the system sent a "heartbeat image" to a Web server every 5 minutes. Each heartbeat image was a snapshot of the crosswalk at the moment it was sent. From this image, the researchers could determine whether the system was operating properly, detect interference between the wireless camera and its receiver, decide whether the wireless card was working properly, and assess weather conditions at the test site.
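The heartbeat mechanism can be sketched as a single capture-and-upload step; in the deployed system this step ran every 5 minutes. Both callables below are placeholders for the real camera capture and Web-server upload, not the actual implementation.

```python
# Sketch of one heartbeat cycle; the deployed system repeated this every
# 5 minutes. The callables are placeholders.

HEARTBEAT_INTERVAL_S = 5 * 60

def heartbeat_once(capture_snapshot, upload):
    """Capture a crosswalk snapshot and attempt to post it to the server."""
    image = capture_snapshot()
    try:
        upload(image)       # e.g., an HTTP POST to the monitoring server
        return True
    except Exception:
        return False        # tolerate transient network failures
```

A missing or stale heartbeat image on the server is itself diagnostic: it signals a power, wireless, or PC failure at the cabinet.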
During the field trial, one camera in the stereo system recorded an image every 2 seconds during four predetermined 2-hour periods: period A, 7-9 a.m.; period B, 11 a.m.-1 p.m.; period C, 4-6 p.m.; and period D, 8-10 p.m. These four periods represent three peak-traffic windows and one night period. The system stored the recorded images on the mini-PC's hard drive, and the researchers used them to estimate the positive detection rate and the missing detection rate. The researchers manually scanned the recorded images to identify pedestrians and then checked them against the detections, which were also recorded. If pedestrians present in the recorded images also appeared among the images with detected pedestrians at the same time, the researchers counted the detection as accurate; otherwise, the pedestrians were counted as missed.
Evaluation of Prototype Performance
Because the study did not account for the total number of vehicles passing through the test site during the field trial, the researchers expressed the false detection rate as the number of false detections (that is, vehicles) per minute. The two intervals with the most missing detections were 8-9 a.m. and 4-5 p.m., which coincided with the local rush hours.
Other time intervals had no missing detection information because there were no independent recordings during these time intervals. One way to estimate the missing detections in these time intervals was to infer the missing detections from the actual positive detections. The researchers assumed that the number of missing detections was approximately proportional to the number of actual detections during the same period. They used this rule to infer the missing detections in the time intervals in which there were no recorded images.
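The proportionality assumption amounts to scaling an observed miss-per-detection ratio. A small worked example with hypothetical numbers (not the trial's data):

```python
# The proportionality assumption, in miniature; all numbers here are
# hypothetical, not the trial's data.

def infer_missing(observed_missing, observed_positive, interval_positive):
    """Scale the observed miss-per-detection ratio to another interval."""
    return (observed_missing / observed_positive) * interval_positive

# If 2 misses were found among 100 positives in a monitored interval,
# an unmonitored interval with 50 positives is credited with about 1 miss.
print(infer_missing(2, 100, 50))
```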
Using the number of missing detections (actual and estimated), the researchers estimated the rate of missed pedestrian detection as one pedestrian every 15 hours between 6 a.m. and 9 p.m. Because few if any pedestrians crossed the intersection after 9 p.m. and before 6 a.m., the researchers concluded that the pedestrian missing rate is approximately one per day.
The researchers summarized the overall performance of the prototype as follows: false alarm rate = one per 60-70 minutes; pedestrian missing rate = one per day.
The system detected many pedestrians at the intersection, individuals and groups, during the field trial. For example, the system detected a pedestrian walking with a cane. Detection of slow-moving people is important because they might require extended service times to walk across the street safely.
The system also detected children with bicycles crossing the intersection. This ability is extremely important for protecting the safety of children. Although pedestrians generally are difficult to detect at night using regular video cameras, the IR LED stereo camera developed by the researchers was able to detect pedestrians in the dark. And, in early morning and late afternoon, pedestrians often cast long shadows that make it difficult to detect them accurately. But, again, the researchers' system overcame these challenges.
With the field trial completed, the third phase of the project, product development, is now underway. In the current prototype, all detection algorithms are executed by the mini-PC placed inside the traffic signal controller cabinet. In the final product offering, the IR LED stereo camera itself will host the detection algorithms. Moreover, the smart stereo camera will support Wi-Fi wireless communication, making it an Internet protocol (IP) IR LED stereo camera that overcomes the difficulties of disparity estimation caused by random IP packet delays. (Commercial IP cameras have random delays, making it impossible to acquire two images simultaneously using two stand-alone commercial IP cameras, so the disparity map cannot be estimated from them. In this study, the researchers acquired the images directly from the two IR LED cameras to avoid the random delays. Detection results are transmitted wirelessly using TCP/IP.)
Phase three funding also will facilitate product prototyping and manufacturing. FHWA will carefully balance the mechanical design of the product and its cost to make the final product affordable for widespread deployment. The researchers expect the final product to cost less than $3,000.
And what of the ultimate, long-term value of this research? Says Dan Stewart, manager of the bicycle and pedestrian program at the Maine Department of Transportation, "This initiative has the strong potential to improve pedestrian safety and reduce injuries and deaths, as well as improve traffic flow."
David R.P. Gibson is a highway research engineer on the Enabling Technologies Team in FHWA's Office of Operations Research and Development. He is a registered professional traffic engineer with bachelor's and master's degrees in civil engineering from Virginia Polytechnic Institute and State University. His areas of interest include traffic sensor technology, traffic control hardware, traffic modeling, and traffic engineering education.
Bao Lang is the district traffic engineer for MassHighway. He is a registered professional engineer with a bachelor's degree in civil engineering from the University of Massachusetts.
Bo Ling received his M.S. in applied mathematics and Ph.D. in electrical engineering from Michigan State University. He has served as a principal investigator for numerous government-funded research projects. He is a cofounder and president and CEO of Migma Systems, Inc. Ling is a senior member of the Institute of Electrical and Electronics Engineers, Inc. and a part-time faculty member of Northeastern University's Department of Electrical and Computer Engineering. His research interests include applying advanced signal processing algorithms to sensing applications.
Uma Venkataraman has an M.S. in computer science from India's University of Madras and more than 8 years of software development experience. She joined Migma Systems as a senior software engineer in 2003. She focuses on application software development, with research interest in the application of ontological models to enable data sharing and system interoperation among disparate systems.
James Yang received his B.S. in mechanical engineering and B.S. in computer science from Southeast University in China, and his master's in computer information systems from Boston University. He joined Migma Systems as a network engineer in 2005.
Acknowledgement: The technology described here was developed under USDOT SBIR Phase I funding (Contract No. DTRT57-05-C-10105) and Phase II funding (Contract No. DTRT57-06-C-10030).