U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590

Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

This fact sheet is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-HRT-17-117
Date: February 2018


The Exploratory Advanced Research Program

Knowledge Discovery in Massive Transportation Datasets


Merging Information from Disparate Sources to Enhance Traffic Safety

Exploratory Advanced Research - Next Generation Transportation Solutions


Photograph. Vehicles travel down a wet highway in both directions.
© Gwoeii/Shutterstock.com

Broad adoption of engineering and policy advances, including air bags, highway safety barriers, and distracted driving laws, contributes to increased vehicle safety on our Nation’s roadways. Although the number of deaths and injuries from crashes decreased slightly in 2017, the number of deaths that year is still at a level not seen since 2007 (National Safety Council 2017). One promising avenue for reducing crashes lies in extracting and analyzing safety-related information from vast and expanding datasets related to driver behavior, vehicle performance, traffic patterns, weather, and infrastructure characteristics. Identifying and making sense of this information will require new techniques.

The Federal Highway Administration (FHWA) Exploratory Advanced Research (EAR) Program is supporting research projects that can process massive amounts of transportation-related data from structured, semistructured, and unstructured datasets using open-source tools and technology. The Palo Alto Research Center, Inc. (PARC) is developing automated methods to integrate information from large unrelated datasets. CUBRC, a Buffalo, New York-based systems integration research organization, is developing a layered infrastructure to ingest, store, analyze, and display information.

Acquiring and Compiling Big Data for Traffic Safety

For decades, traffic safety researchers have developed and expanded datasets that describe human behavior, vehicle conditions, and other contextual information related to highway crashes. FHWA’s Highway Safety Information System (HSIS) compiles quality data on accident, roadway, and traffic variables collected by States for managing highway systems and studying safety. The second Strategic Highway Research Program (SHRP2) includes a naturalistic driving study (NDS), a resource that includes trip summary records describing more than 3,400 drivers and vehicles involved in roughly 36,000 baseline driving events, including crashes and near-crashes. A related SHRP2 Roadway Information Database (RID) contains detailed information about NDS trips on the most frequently traveled roadway sections, including roadway curvature, number and type of lanes, intersections, guardrails and barriers, and lighting.

Other sources of information, such as Clarus roadway-weather data and video logs that capture roadway features and characteristics, can provide important data related to traffic safety. “Merging traffic-related information from disparate sources makes it possible to detect safety issues that might not be identified by looking at traditional datasets only,” says Ana Maria Eigen of FHWA’s Office of Safety Research and Development. EAR Program-supported researchers at PARC are developing automated machine learning methods that will replace slower manual methods to extract, clean, and restructure data. PARC is using video, radar, and still photography information gathered at Chicago intersections. Tools developed through this project will be refined for use with similar data-rich traffic information resources.
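As a rough illustration of what merging disparate sources can mean in practice, the sketch below joins crash records to a roadway inventory by segment and to the nearest-in-time weather observation. It is a minimal example under stated assumptions, not the method used by PARC or CUBRC: all records, field names (`segment`, `surface`, `curve`), and the one-hour matching window are hypothetical and do not reflect the actual HSIS, SHRP2, or Clarus schemas.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical crash records (illustrative fields, not a real schema).
crashes = [
    {"crash_id": 1, "segment": "A", "time": datetime(2017, 3, 1, 8, 5)},
    {"crash_id": 2, "segment": "A", "time": datetime(2017, 3, 1, 17, 40)},
    {"crash_id": 3, "segment": "B", "time": datetime(2017, 3, 2, 9, 15)},
]

# Hypothetical roadway inventory keyed by segment identifier.
roadway = {
    "A": {"curve": True, "lanes": 2},
    "B": {"curve": False, "lanes": 4},
}

# Hypothetical roadway-weather observations.
weather = [
    {"segment": "A", "time": datetime(2017, 3, 1, 8, 0), "surface": "wet"},
    {"segment": "A", "time": datetime(2017, 3, 1, 18, 0), "surface": "dry"},
    {"segment": "B", "time": datetime(2017, 3, 2, 6, 0), "surface": "wet"},
]


def nearest_weather(segment, when, max_gap=timedelta(hours=1)):
    """Return the observation closest in time for a segment.

    Returns None when the segment has no observation within max_gap.
    """
    candidates = [w for w in weather if w["segment"] == segment]
    if not candidates:
        return None
    best = min(candidates, key=lambda w: abs(w["time"] - when))
    return best if abs(best["time"] - when) <= max_gap else None


def merge_records():
    """Attach roadway attributes and surface condition to each crash."""
    merged = []
    for crash in crashes:
        row = dict(crash)
        row.update(roadway.get(crash["segment"], {}))
        obs = nearest_weather(crash["segment"], crash["time"])
        row["surface"] = obs["surface"] if obs else "unknown"
        merged.append(row)
    return merged


merged = merge_records()

# Count crashes by (on a curve?, surface condition), a view that is
# only possible once the three sources have been combined.
counts = Counter((row.get("curve"), row["surface"]) for row in merged)
for key, n in sorted(counts.items(), key=str):
    print(key, n)
```

Marking unmatched crashes as "unknown" rather than borrowing a distant observation keeps gaps in the weather data visible to the analyst instead of hiding them.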


Federal Highway Administration | 1200 New Jersey Avenue, SE | Washington, DC 20590 | 202-366-4000
Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101