U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000



Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

 
REPORT
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number:  FHWA-HRT-13-026    Date:  March 2014

 

Guidance on the Level of Effort Required to Conduct Traffic Analysis Using Microsimulation

CHAPTER 4. DATA COLLECTION

The validity of any model depends on the quality of the data that go into it. Data requirements should be evaluated at the inception of a project to obtain an early estimate of the effort required. The quality and variability of existing data affect the sample sizes needed; therefore, early statistical evaluation of the data (e.g., margin of error) can prove highly valuable for estimating that effort. Key calibration performance measures should be identified early to help determine the data needed to estimate them.

Often, existing data can be used, but these data may be outdated or drawn from different timeframes for different parts of the network. In that case, resources need to be allocated for new data collection. If funding is limited, resources need to be spent judiciously to collect enough data of adequate quality. Table 7 lists example data required for simulation model calibration.

It also is important to plan for documentation of the information. The documentation methods should include a clear file-naming structure, an explanation of the sources of data, and a database structure that can be readily incorporated into model inputs and model reports. Efficiency of managing the data can be further enhanced through the development of automated procedures.
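A clear file-naming structure can be enforced with a small helper. The following is a minimal sketch only: the field order (project, data type, location, collection date, source) and the example project and location names are assumptions made for illustration and are not prescribed by this report.

```python
from datetime import date


def data_file_name(project: str, data_type: str, location: str,
                   collected: date, source: str, ext: str = "csv") -> str:
    """Build a file name from fixed fields joined by underscores.

    The field order is a hypothetical convention, not one taken from the report.
    """
    def clean(field: str) -> str:
        # Keep only letters and digits so underscores remain unambiguous separators.
        return "".join(ch for ch in field.lower() if ch.isalnum())

    parts = [
        clean(project),
        clean(data_type),
        clean(location),
        collected.strftime("%Y%m%d"),  # sortable date stamp
        clean(source),
    ]
    return "_".join(parts) + "." + ext


# Hypothetical example values.
example_name = data_file_name(
    "I-95 Corridor", "turning counts", "Main St & 1st Ave",
    date(2013, 5, 14), "field",
)
print(example_name)
```

Because every field sits in a fixed position, file names of this kind can later be split on the underscore to recover the data source and collection date when preparing model inputs or reports.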

As part of the scope development process, a data collection plan needs to be developed. The plan identifies the sources and types of data needed to prepare and calibrate the model.

Table 7. Example data requirements for simulation.

Physical Geometry

  • Link distances
  • Free-flow speeds

Physical Geometry, Freeways

  • Number of travel lanes
  • Presence of shoulders
  • HOV lanes (if any) and operational characteristics
  • Acceleration/deceleration lanes
  • Grade
  • Curvature
  • Ramps

Physical Geometry, Arterials

  • Number of lanes
  • Lane usage
  • Length of turn pockets
  • Grade
  • Turning restrictions
  • Parking

Traffic Control, Freeways

  • Ramp metering
      • Type (local, system-wide)
      • Detectors
      • Metering rates
      • Algorithms (adaptive metering)
  • Lane use signals
  • Variable speed limits

Traffic Control, Arterials

  • Signal system description
  • Controller type
  • Phasing
  • Detector type and placement
  • Signal settings
  • Signal timing plans
  • Transit signal priority system

Travel Demand

  • Link volume
  • Traffic composition
  • On- and off-ramp volumes
  • Turning movement counts
  • Vehicle and person trip tables

Intelligent Transportation Systems (ITS) Elements

  • Traffic Management Center and surveillance system
  • Detector type
  • Detector spacing
  • Closed-circuit television
  • Information dissemination (changeable message signs, highway advisory radio, 511 (traveler information telephone number), etc.)
  • Tolling system type and pricing mechanism (if any)
  • Data archival and dissemination
  • Incident detection and management characteristics

 

TYPES OF DATA

The types of data required can be grouped into four main areas: travel demand, traffic control, physical geometry, and ITS elements, as highlighted in table 7. Travel demand data include traffic counts, vehicle classification counts, speeds, travel times, congestion, and queuing observations. Travel demand data require the majority of the data collection effort. Traffic control data include signs, signal control, and timing plans. Physical geometry can be obtained from rectified aerial photography and base mapping files that may be prepared as part of the design effort for projects.

The four types of data can be further categorized into the following two areas:

The tenets of data collection and data management that are important for conducting an effective analytical study include the following:

Helpful tips on data collection include the following:

Key questions related to data collection include the following:

DATA SOURCES

Travel Demand

The basic demand data needed by most simulation software are the entry volumes (i.e., the travel demand entering the study area) at different points of the network. At intersections, the turning volumes or percentages should be specified.

O-D

Origin-destination (O-D) trips can be estimated from a combination of travel demand model (TDM) trip tables and traffic counts. O-D data can be acquired from the local metropolitan planning organization’s (MPO) regional TDM, but these are generally 24-h estimates and, as such, need to be adjusted and refined to produce hourly or peak period estimates for use in simulation models. Typically, these estimates are further disaggregated into the 5- or 15-min estimates required for simulation. License plate matching surveys can be used to estimate hourly trips, but they are resource-intensive. Depending on project complexity, a cost-effective way of estimating O-D trips is to adjust the TDM O-D trip estimates based on field counts.
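One common way to adjust a seed trip table to field counts is iterative proportional fitting (often called the Furness method). The report does not prescribe this technique; the sketch below is offered only as an illustration, and the seed matrix and count totals are made-up values. It assumes the seed has already been factored from 24-h to hourly trips and that the origin and destination totals sum to the same value.

```python
def furness(seed, origin_totals, dest_totals, iterations=100):
    """Scale a seed O-D matrix so that row sums approach the counted origin
    totals and column sums match the counted destination totals.

    Assumes sum(origin_totals) == sum(dest_totals).
    """
    od = [row[:] for row in seed]  # work on a copy of the seed table
    n_rows, n_cols = len(od), len(od[0])
    for _ in range(iterations):
        # Balance rows against the origin counts.
        for i in range(n_rows):
            row_sum = sum(od[i])
            if row_sum > 0:
                factor = origin_totals[i] / row_sum
                od[i] = [value * factor for value in od[i]]
        # Balance columns against the destination counts.
        for j in range(n_cols):
            col_sum = sum(od[i][j] for i in range(n_rows))
            if col_sum > 0:
                factor = dest_totals[j] / col_sum
                for i in range(n_rows):
                    od[i][j] *= factor
    return od


# Hypothetical hourly seed table and field-count totals.
seed_od = [[10.0, 20.0], [30.0, 40.0]]
adjusted = furness(seed_od, origin_totals=[40.0, 60.0], dest_totals=[50.0, 50.0])
```

Cells that are zero in the seed stay zero, so the travel pattern of the TDM is preserved while the totals are pulled toward the counts.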

If the study area has transit, HOVs, and trucks or if there is significant interaction with bicycles and pedestrians, the corresponding demand data would be needed. In addition, vehicle dimensions and vehicle performance characteristics (e.g., maximum acceleration and deceleration) are required.

Even if only the peak periods are being examined, demand data should be collected before the onset of congestion and should continue until after the congestion has dissipated. Also, to capture the temporal variations in demand, it is best not to aggregate demand data to intervals longer than 15 min.
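As an illustration of keeping demand data at 15-min resolution, the sketch below bins individual vehicle detection timestamps into 15-min counts. The function name and the sample timestamps are hypothetical, and the approach assumes the interval length divides evenly into 60 min.

```python
from collections import Counter
from datetime import datetime


def counts_by_interval(timestamps, minutes=15):
    """Count detections per interval, keyed by the interval start time.

    Assumes `minutes` divides evenly into 60 (e.g., 5 or 15).
    """
    counts = Counter()
    for ts in timestamps:
        # Round the minute down to the start of its interval.
        floored_minute = (ts.minute // minutes) * minutes
        start = ts.replace(minute=floored_minute, second=0, microsecond=0)
        counts[start] += 1
    return dict(sorted(counts.items()))


# Made-up detection times for illustration.
detections = [
    datetime(2013, 5, 14, 7, 1, 0),
    datetime(2013, 5, 14, 7, 14, 59),
    datetime(2013, 5, 14, 7, 15, 0),
    datetime(2013, 5, 14, 7, 44, 0),
]
interval_counts = counts_by_interval(detections)
```

Intervals with no detections do not appear in the result, so a complete time series would need those intervals filled with zeros before use as model input.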

Vehicle Characteristics

Vehicle characteristics data can be obtained from the State transportation departments or air quality management agencies. National data can be obtained from car manufacturers, the Environmental Protection Agency, and FHWA.

Traffic Control

Data from traffic control devices at intersections or junctions are required. Control data refers to the type of control device (e.g., traffic signal, stop sign, ramp meter, etc.), the locations of these control devices, and the signal timing plans. Traffic control data can be obtained from the agencies that operate the traffic control devices in the given study area.

Traffic operations and management data on links are also needed. These include location and type of warning signs. If there are HOV lanes, information on the HOV lane requirement (e.g., HOV-2 versus HOV-3), their hours of operation, and the location of signs are needed. If there are high-occupancy toll lanes, information on the pricing strategy is required.

Operational Conditions

If there are variable message signs (VMSs) in the study area, the type of information that is displayed, the location, and, if possible, the actual messages that were displayed are needed. Most of this information can be obtained from the public agencies that are responsible for operating the VMSs. The types of signs and locations can be obtained from GIS files, aerial photographs, and construction drawings.

Event data can be obtained from public agency sources, such as traffic management center logs. Crash databases should be verified since data may not always be recent and may not be for the specific study area. The data should be from concurrent timeframes.

Transit Data

Transit data can be obtained from the local and regional transit operators. These data can include schedules and stop locations. Calibration data could include transit automatic vehicle locator data, boarding and alighting data, and dwell time at stops.

Mobile Source Data

Mobile source data include data derived from mobile phones, Bluetooth® devices, and other mobile sources. Mobile source data are relatively new sources of data that can be used to augment other data collected. The primary types of data obtained from mobile sources are speed and travel time. Because mobile source techniques in most cases sample only a portion of the vehicles in the traffic stream, they may not be reliable sources for traffic counts and vehicle composition.

The mobile source data are typically obtained, stored, and sold by private vendors. Before purchasing these types of data, it is a good idea to have a demonstration of the data and a means to compare the data supplied by the vendor with a real observation of what is happening in the field. Consideration of how the data are provided (format, structure, and software) should be given and should take into account how they will be used for traffic modeling and other purposes.
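One simple way to compare vendor data with field observations is the mean absolute percentage error between paired travel times. This metric is an assumption chosen for illustration, not a test specified by the report, and the travel times shown are made-up values.

```python
def mean_abs_pct_error(vendor, field):
    """Mean absolute percentage error of vendor values relative to field values.

    Both sequences must be the same length, cover the same segments and time
    periods in the same order, and contain no zero field values.
    """
    if len(vendor) != len(field) or not field:
        raise ValueError("vendor and field must be non-empty and equal in length")
    total = sum(abs(v - f) / f for v, f in zip(vendor, field))
    return 100.0 * total / len(field)


# Hypothetical paired travel times (seconds) for two segments.
vendor_times = [110.0, 90.0]
field_times = [100.0, 100.0]
error_pct = mean_abs_pct_error(vendor_times, field_times)
```

What counts as an acceptable error is a project decision; the comparison only quantifies how far the vendor data sit from what was observed.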

The following list provides a brief description of techniques and technologies that are all probe vehicle-based applications:

CHALLENGES WITH DATA

Systematically collecting the critical data, verifying data quality, and documenting any assumptions are important to justify the results of a study to decisionmakers and the public. A statistical analysis of collected and previously available data can help determine the variability of the data and the margin of error they contain. Data variability can have a significant effect on the number of model runs required for the model to replicate observed traffic conditions. This report provides helpful suggestions on data types, sources, and challenges. Development of a more comprehensive data quality guide is under consideration by FHWA.
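The margin of error of a set of observations can be estimated from its mean and sample standard deviation. The sketch below is a minimal illustration that assumes a normal approximation and a 95-percent confidence level (Z = 1.96); the travel times are made-up values.

```python
import math
import statistics

Z_95 = 1.96  # assumed confidence level of 95 percent


def margin_of_error_percent(observations, z=Z_95):
    """Margin of error of the sample mean, expressed as a percent of the mean."""
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.mean(observations)
    sd = statistics.stdev(observations)  # sample standard deviation
    return 100.0 * z * (sd / mean) / math.sqrt(n)


# Hypothetical travel time observations in minutes.
travel_times_min = [10.0, 12.0, 14.0]
moe_pct = margin_of_error_percent(travel_times_min)
```

With so few observations the normal approximation is rough; the example is meant to show the calculation, not to recommend a sample size.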

Data Comprehensiveness

Comprehensive data cover different performance measures (i.e., volumes, speeds, bottlenecks, queuing, and congestion data) across freeways and arterial streets, as well as transit data and incident data. Traffic counts should be taken at key locations in the study area. Key locations include major facilities (i.e., freeway segments, major intersections and interchanges, and major on- and off-ramps). If possible, this should be done simultaneously at all key locations. Otherwise, the counts should be taken during similar timeframes with similar demand patterns and weather conditions.

If a model that was calibrated several years ago is being used, it needs to be recalibrated to reflect more current field conditions. Therefore, in addition to the data needed to estimate capacity and the calibration performance measures, demand data need to be collected. Furthermore, the analyst must verify the accuracy of geometric data, traffic control data, traffic operations, and management data.

Challenges

Due to the innovative nature of many operational strategies, collecting relevant data to support such analyses includes the following challenges:

Table 8. Margin of error for different standard deviation to mean ratios.

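Columns, left to right: sample size (n); Z/√n; and the margin of error (percent of the mean) for standard deviation/mean ratios of 0.1, 0.2, 0.3, 0.4, and 0.5.
n Z/√n 0.1 0.2 0.3 0.4 0.5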
2 1.385929291 13.85929 27.71859 41.57788 55.43717 69.29646
5 0.876538647 8.765386 17.53077 26.29616 35.06155 43.82693
10 0.619806421 6.198064 12.39613 18.59419 24.79226 30.99032
15 0.506069824 5.060698 10.1214 15.18209 20.24279 25.30349
20 0.438269324 4.382693 8.765386 13.14808 17.53077 21.91347
25 0.392 3.92 7.84 11.76 15.68 19.6
30 0.357845404 3.578454 7.156908 10.73536 14.31382 17.89227
35 0.331300468 3.313005 6.626009 9.939014 13.25202 16.56502
40 0.309903211 3.099032 6.198064 9.297096 12.39613 15.49516
45 0.292179549 2.921795 5.843591 8.765386 11.68718 14.60898
50 0.277185858 2.771859 5.543717 8.315576 11.08743 13.85929
55 0.264286346 2.642863 5.285727 7.92859 10.57145 13.21432
60 0.253034912 2.530349 5.060698 7.591047 10.1214 12.65175
65 0.24310808 2.431081 4.862162 7.293242 9.724323 12.1554
70 0.234264807 2.342648 4.685296 7.027944 9.370592 11.71324
75 0.226321306 2.263213 4.526426 6.789639 9.052852 11.31607
80 0.219134662 2.191347 4.382693 6.57404 8.765386 10.95673
85 0.212591849 2.125918 4.251837 6.377755 8.503674 10.62959
90 0.20660214 2.066021 4.132043 6.198064 8.264086 10.33011
95 0.201091757 2.010918 4.021835 6.032753 8.04367 10.05459
100 0.196 1.96 3.92 5.88 7.84 9.8
200 0.138592929 1.385929 2.771859 4.157788 5.543717 6.929646
300 0.113160653 1.131607 2.263213 3.39482 4.526426 5.658033
400 0.098 0.98 1.96 2.94 3.92 4.9
500 0.087653865 0.876539 1.753077 2.629616 3.506155 4.382693
Z = Value from the standard normal distribution for the selected confidence level.
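The entries in table 8 are consistent with Z = 1.96 (for example, Z/√n = 0.196 at n = 100), which corresponds to a 95-percent confidence level. Under that assumption, the sketch below reproduces a table entry and inverts the relationship to estimate the sample size needed for a target margin of error. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumption inferred from the Z/√n column of table 8


def table8_margin_pct(n, sd_mean_ratio, z=Z_95):
    """Margin of error (percent of the mean) for sample size n."""
    return 100.0 * z * sd_mean_ratio / math.sqrt(n)


def min_sample_size(target_pct, sd_mean_ratio, z=Z_95):
    """Smallest n whose margin of error does not exceed target_pct."""
    return math.ceil((100.0 * z * sd_mean_ratio / target_pct) ** 2)


print(round(table8_margin_pct(25, 0.5), 2))   # table 8 lists 19.6 for n = 25, ratio 0.5
print(min_sample_size(5.0, 0.3))              # samples needed for a 5-percent margin
```

For a standard deviation/mean ratio of 0.3, a 5-percent margin of error requires a sample size between the n = 100 and n = 200 rows of the table (5.88 and 4.16 percent, respectively).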

DEVELOPMENT OF A DATA COLLECTION PLAN

This section describes a typical data collection plan for traffic analysis. The data collection plan will guide the collection, compilation, analysis, and archiving of data.

Approach and Work Steps

This section presents an overview of the subtasks and work steps related to the development and implementation of a data collection plan. The data collection plan should retain sufficient flexibility so that lessons learned in the compilation of data sources may be incorporated as part of the continuous improvement of the analysis effort.

Specific steps in the development of the data collection plan include the following:

  1. Research and identify available data for the study area: Existing data sources and data requirements should be used to identify available data for the analysis area. The data collection plan should also identify those individuals/stakeholders responsible for compiling the data. The analysis manager should work closely with stakeholders to compile the data. If possible, the analysis manager should obtain samples of the datasets prior to full collection to view the content and format of the data and adjust collection plans as necessary.

  2. Identify information/data gaps and recommend an approach to filling those gaps: Once available data sources have been investigated and dataset samples have been reviewed, the analysis manager should assess the appropriateness of the available data for use in the analysis and identify any critical gaps in data availability. Potential approaches to filling data gaps should be investigated, and recommended approaches should be documented in the data collection plan.

  3. Identify data management strategies: Procedures for conducting data quality control and data archiving should be identified. Any required thresholds for minimum data quality should be identified as well as high-level descriptions of processes for addressing data shortcomings. Plans for archiving the data should also be identified. Responsibilities for data quality testing and data archiving should be clearly defined.

  4. Develop data collection plan: The data collection plan should document all of the information listed in these steps and detail data elements to be obtained and their respective data sources. The data collection plan should outline data collection methodologies and contain budget and schedule estimates to fill data gaps.
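The minimum data quality thresholds mentioned in step 3 can be expressed as simple automated checks. The sketch below tests the completeness of a series of interval counts; the 90-percent threshold, the use of None for missing intervals, and the sample values are assumptions made for illustration and do not come from the report.

```python
def completeness(counts):
    """Fraction of intervals with a usable (non-missing, non-negative) count."""
    if not counts:
        return 0.0
    usable = sum(1 for c in counts if c is not None and c >= 0)
    return usable / len(counts)


def passes_quality(counts, min_completeness=0.9):
    """True when the share of usable intervals meets the assumed threshold."""
    return completeness(counts) >= min_completeness


# Hypothetical 15-min counts with one missing interval.
detector_counts = [100, None, 120, 110]
share_usable = completeness(detector_counts)
acceptable = passes_quality(detector_counts)
```

A dataset that fails a check of this kind would be routed to whatever process the plan defines for addressing data shortcomings, such as recollection or substitution from another source.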

An example outline for the data collection plan is as follows:

Once the data collection plan is developed, the required data should be collected in accordance with the plan. Generally, implementing the plan includes the following activities:

Deliverables

Major deliverables under this task include the data collection plan and the archived datasets.

Schedule

The time required to complete this task depends on the types, quantity, and quality of data required; the data collection methods; and the amount of readily available archived data from automated sources. Developing the data collection plan is estimated to take approximately 2–4 months. The time needed to complete data collection is highly variable and may take approximately an additional 2–6 months, depending on the data required.

 

Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101