U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Policy and Governmental Affairs
Office of Highway Policy Information


Post-event Connected Vehicle Data Exploration - Lessons Learned


3. CV Data and Size

CV data emerges from vehicles communicating with each other and infrastructure, enhancing travel safety and operational efficiency by exchanging safety and mobility information. CV data encompasses vehicle and infra

CV data poses unique challenges due to its size. The three JPO Pilot projects (Tampa Hillsborough Expressway Authority, I-80 in Wyoming, and New York City) generate over 33 GB per day. In Florida, CV vehicles from four OEMs, which represent less than 10% of the state's 19 million registered vehicles, generate over 60 GB of data in a single day, with more than a billion data records. Extrapolating from that daily volume, a year of coverage in Florida could produce more than 21 terabytes of data.

For reference and comparison, FHWA's National Bridge Inventory (NBI) annual data is approximately 1.5 GB, and the Highway Performance Monitoring System (HPMS) annual data is only about 4 GB.
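The arithmetic behind these comparisons can be checked with a short sketch. It uses only the figures quoted above and assumes decimal units (1 TB = 1000 GB) and a simple linear extrapolation over 365 days. Because the text gives the daily volume as "over 60 GB", the results are lower bounds rather than measurements. The constant and function names are illustrative only.

```python
# Figures quoted in the text; decimal units are assumed (1 TB = 1000 GB).
FLORIDA_OEM_GB_PER_DAY = 60.0   # "over 60 GB" per day, used here as a lower bound
NBI_GB_PER_YEAR = 1.5           # National Bridge Inventory, annual
HPMS_GB_PER_YEAR = 4.0          # Highway Performance Monitoring System, annual
DAYS_PER_YEAR = 365


def annual_gb(gb_per_day: float, days: int = DAYS_PER_YEAR) -> float:
    """Linearly extrapolate a daily data volume to a full year."""
    return gb_per_day * days


florida_gb = annual_gb(FLORIDA_OEM_GB_PER_DAY)
florida_tb = florida_gb / 1000
ratio_nbi = florida_gb / NBI_GB_PER_YEAR
ratio_hpms = florida_gb / HPMS_GB_PER_YEAR

print(f"Florida OEM CV data: at least {florida_tb:.1f} TB per year")
print(f"About {ratio_nbi:,.0f} times the annual NBI data")
print(f"About {ratio_hpms:,.0f} times the annual HPMS data")
```

Under these assumptions, one year of Florida OEM CV data is at least 21.9 TB, which is thousands of times larger than either the NBI or the HPMS annual data.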

CV data demands substantial storage and bandwidth for transmission.
Decision makers need to be aware of the data size challenge.

CV Data Variables

Understanding the CV data variables is crucial because they determine what information can be extracted. The OEM CV data comprises approximately 60 variables, split into a vehicle movement dataset and a driving event dataset. The vehicle movement dataset traces each vehicle journey as a series of timestamped points, each with a geolocation, speed, heading, and ignition state. The driving event dataset records various events along with their corresponding geolocations, speeds, headings, and timestamps.

The following is a list of vehicle movement data variable names:

  • journey_id
  • datapoint_id
  • captured_time_local
  • captured_time_utc
  • local_captured_date
  • geohash
  • latitude
  • longitude
  • state_code
  • postal_code
  • country_code
  • heading
  • speed
  • ignition_state
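As an illustration of how these variables fit together, the sketch below rebuilds per-journey traces by grouping records on journey_id and ordering them by captured_time_utc. It is a minimal sketch under stated assumptions: the records are already loaded as dictionaries, captured_time_utc is an ISO 8601 string, and the sample values (including speed and heading units) are invented for illustration and do not come from the actual dataset. The function name group_journeys is hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; all values are made up for illustration.
rows = [
    {"journey_id": "j1", "datapoint_id": "d2",
     "captured_time_utc": "2024-01-01T12:00:03+00:00",
     "latitude": 27.9510, "longitude": -82.4570, "speed": 31.0},
    {"journey_id": "j2", "datapoint_id": "d3",
     "captured_time_utc": "2024-01-01T12:00:01+00:00",
     "latitude": 28.5380, "longitude": -81.3790, "speed": 12.0},
    {"journey_id": "j1", "datapoint_id": "d1",
     "captured_time_utc": "2024-01-01T12:00:00+00:00",
     "latitude": 27.9506, "longitude": -82.4572, "speed": 28.0},
]


def group_journeys(records):
    """Group movement records by journey_id and sort each group by UTC time."""
    journeys = defaultdict(list)
    for rec in records:
        journeys[rec["journey_id"]].append(rec)
    for points in journeys.values():
        # Assumes ISO 8601 timestamps with an explicit UTC offset.
        points.sort(key=lambda r: datetime.fromisoformat(r["captured_time_utc"]))
    return dict(journeys)


journeys = group_journeys(rows)
for journey_id, points in journeys.items():
    print(journey_id, [p["datapoint_id"] for p in points])
```

Sorting on the UTC timestamp rather than the local one avoids ambiguity around daylight saving time changes.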

The following is a list of driving event data variable names:

  • acceleration_type
  • anti_lock_braking_system_status
  • autonomous_emergency_braking_type
  • captured_time_local
  • captured_time_utc
  • country_code
  • datapoint_id
  • door_identifier
  • door_status_change_type
  • electronic_stability_status
  • event_type
  • exterior_temperature
  • fuel_consumption
  • fuel_level
  • geohash
  • heading
  • ignition_state
  • journey_event_type
  • journey_id
  • lateral_acceleration
  • latitude
  • light_identifier
  • light_state_change_type
  • local_captured_date
  • longitudal_acceleration
  • longitude
  • odometer
  • parking_brake_identifier
  • parking_brake_status_change_type
  • postal_code
  • seat_identifier
  • seat_occupancy_status
  • seatbelt_status
  • seatbelt_warning_status_change_type
  • signal_identifier
  • signal_state_change_type
  • speed
  • speed_threshold_change_type
  • state_code
  • wiper_identifier
  • wiper_interval
  • wiper_state_change_type


The variable event_type takes one of the following values:

Acceleration_Change, Anti_Lock_Braking_System_State_Change, Autonomous_Emergency_Braking_Change, Door_State_Change, Electronic_Stability_State_Change, Journey, Light_State_Change, Parking_Brake_State_Change, Seat_Belt_Change, Seat_Belt_Warning, Seat_Occupancy_Change, Signal_State_Change, Wiper_State_Change, Speed_Threshold_Change

All event data include their corresponding geolocations, speeds, headings, fuel levels, odometer readings, and timestamps when events occurred.
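A common first step with the driving event dataset is to tally records by event_type. The sketch below does this with the 14 values listed above and collects anything outside that list separately, so that spelling or capitalization differences in a real file are surfaced rather than silently dropped. The sample records and the function name summarize_events are hypothetical, and it is an assumption that real files use exactly the capitalization shown in the text.

```python
from collections import Counter

# The 14 event_type values listed in the text.
EVENT_TYPES = frozenset({
    "Acceleration_Change",
    "Anti_Lock_Braking_System_State_Change",
    "Autonomous_Emergency_Braking_Change",
    "Door_State_Change",
    "Electronic_Stability_State_Change",
    "Journey",
    "Light_State_Change",
    "Parking_Brake_State_Change",
    "Seat_Belt_Change",
    "Seat_Belt_Warning",
    "Seat_Occupancy_Change",
    "Signal_State_Change",
    "Wiper_State_Change",
    "Speed_Threshold_Change",
})


def summarize_events(records):
    """Count records per listed event_type; tally unlisted values separately."""
    known, unexpected = Counter(), Counter()
    for rec in records:
        value = rec.get("event_type")
        if value in EVENT_TYPES:
            known[value] += 1
        else:
            unexpected[value] += 1
    return known, unexpected


# Hypothetical sample records; values are made up for illustration.
sample = [
    {"event_type": "Wiper_State_Change", "speed": 41.0},
    {"event_type": "Seat_Belt_Change", "speed": 0.0},
    {"event_type": "Wiper_State_Change", "speed": 38.5},
    {"event_type": "wiper_state_change", "speed": 12.0},  # wrong case on purpose
]

known, unexpected = summarize_events(sample)
print(known.most_common())
print(dict(unexpected))
```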

The JPO Pilot CV data, obtained through custom-installed safety devices (On-Board Unit) on pilot vehicles and roadside units (RSU), include Basic Safety Messages (BSM), Traveler Information Messages (TIM), Signal Phase and Timing (SPaT), and EVENT categories.

The pilot data are organized into four category files: BSM, TIM, SPaT, and EVENT. BSM contains each acceleration/deceleration point in three axes (longitudinal, lateral, and vertical acceleration) plus the yaw rate. TIM contains roadside sign information. EVENT is a log collection of BSM, TIM, and SPaT with time and location obfuscated. This CV pilot data analysis focuses on BSM and TIM.

While the published JPO CV data follow the SAE International Surface Vehicle Standard J2540, J2735, and J2945 specifications, there are significant differences between the data files in their variables and in the exact meaning of those variables. Data variables, formats, structures, availability, and units vary between pilot sites and even between different RSUs. Also, unlike the commonly used relational tabular data model, the JPO pilot post-CV data take a hybrid approach to representing and saving complex data variables. At the table level, each data point is saved as a regular record. At the record level, however, data are saved hierarchically in JSON format. Both BSM and TIM contain many data variables (over 60). Certain variables, such as PathHistoryPoint in BSM and SEQUENCE_item_itis in TIM, are arrays containing multiple entries.
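One way to work with this hybrid layout is to flatten each hierarchical record into underscore-joined column names, in the style of names such as coreData_accelSet_long used in this section. The sketch below is illustrative only: the nesting in the sample record is a hypothetical guess inferred from those simplified names, not the published schema, and the helper name flatten is an assumption. Arrays are left untouched because they need separate handling.

```python
import json


def flatten(node: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level, joining key paths with '_'."""
    flat = {}
    for key, value in node.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            # Scalars and arrays are kept as-is.
            flat[name] = value
    return flat


# Hypothetical record; the structure and values are made up for illustration.
raw = """
{
  "RSUID": "rsu-001",
  "coreData": {
    "id": "ABC123",
    "speed": 12.5,
    "accelSet": {"long": 0.1, "lat": 0.0, "vert": 0.0, "yaw": 0.2}
  }
}
"""

record = flatten(json.loads(raw))
for name, value in record.items():
    print(name, value)
```

Because variables and units differ between pilot sites and RSUs, flattened columns from different sources should be compared only after checking what each one means.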

The following is a list of BSM data variables:

  • RSUID
  • recordGeneratedBy
  • recordGeneratedAt
  • coreData_id
  • coreData_secMark
  • coreData_lat
  • coreData_long
  • coreData_elev
  • coreData_speed
  • coreData_heading
  • coreData_angle
  • coreData_accelSet_long
  • coreData_accelSet_lat
  • coreData_accelSet_vert
  • coreData_accelSet_yaw
  • PathHistoryPoint_latOffset
  • PathHistoryPoint_lonOffset
  • PathHistoryPoint_elevationOffset
  • PathHistoryPoint_timeOffset

The following is a list of TIM data variables:

  • RSUID
  • recordGeneratedBy
  • recordGeneratedAt
  • TravelerDataFrame_frameType_roadSignage
  • TravelerDataFrame_msgId_roadSignId_position_lat
  • TravelerDataFrame_msgId_roadSignId_position_long
  • TravelerDataFrame_msgId_roadSignId_position_elevation
  • TravelerDataFrame_msgId_roadSignId_viewAngle
  • TravelerDataFrame_msgId_roadSignId_mutcdCode
  • TravelerDataFrame_content_speedLimit_SEQUENCE_item_itis
  • TravelerDataFrame_content_advisory_SEQUENCE_item_itis

The BSM and TIM variables listed above are only the key variables used in the analyses. The names shown are simplified from their hierarchical levels. For more information about this pilot program and CV data, readers may visit https://datahub.transportation.gov/stories/s/Connected-Vehicle-Pilot-CVP-Open-Data/hr8h-ufhq/.

 


Page last modified on May 8, 2024