U.S. Department of Transportation/Federal Highway Administration

Office of Planning, Environment, & Realty (HEP)

Modifying Link-Level Emissions Modeling Procedures for Applications within the MOVES Framework

4.0 Drive Cycle Development

One of the tasks undertaken during this project involved development of drive cycles (also known as drive schedules) based on the Kansas City Emissions Study previously performed by ERG for EPA. MOVES uses real world driving cycle data and the measured emissions from associated drive cycles to model a wide range of possible driving patterns and their resultant emissions. Specifically, MOVES uses the second-by-second speed data included in its drive cycles to calculate the second-by-second vehicle specific power (VSP). It then uses this VSP, along with vehicle type and vehicle age, to weight emission rates before they are applied to the activity data to generate emissions estimates. The resulting estimates are thus based on real world driving.
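To illustrate how second-by-second speed data translates into VSP, the following minimal Python sketch applies the commonly used road-load form of VSP to a short speed trace. The coefficient values, the vehicle mass, and the function names are illustrative placeholders, not values taken from this study or from MOVES, and road grade is assumed to be zero.

```python
MPH_TO_MPS = 0.44704  # exact conversion from miles per hour to meters per second


def vsp_kw_per_tonne(speed_mps, accel_mps2, a, b, c, mass_tonnes):
    """Road-load VSP for one second of driving, in kW per metric ton.

    a, b, c are road-load coefficients (rolling, rotating, and drag terms).
    Grade is assumed to be zero in this sketch.
    """
    v = speed_mps
    power_kw = a * v + b * v**2 + c * v**3 + mass_tonnes * v * accel_mps2
    return power_kw / mass_tonnes


def vsp_trace(speeds_mph, a=0.156, b=0.002, c=0.0005, mass_tonnes=1.5):
    """Second-by-second VSP for a speed trace sampled at 1 Hz.

    The default coefficients and mass are placeholders for illustration only.
    """
    speeds = [s * MPH_TO_MPS for s in speeds_mph]
    # Backward difference over a 1-second step; the first second gets zero.
    accels = [0.0] + [speeds[i] - speeds[i - 1] for i in range(1, len(speeds))]
    return [
        vsp_kw_per_tonne(v, acc, a, b, c, mass_tonnes)
        for v, acc in zip(speeds, accels)
    ]


trace = vsp_trace([0.0, 0.0, 10.0, 10.0])
```

In this sketch the accelerating second produces a higher VSP than the following constant-speed second, which is the behavior the VSP-based weighting is meant to capture.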

Drive cycles in MOVES are classified by average speed, vehicle type, and roadway type. In this section, we discuss the methodology we used to develop our own drive cycles based on the collected Kansas City data, and the ways in which we adapted them for inclusion in the MOVES model itself. Having derived alternate drive cycles, we created MOVES modeling runs using those drive cycles for comparison to the modeling runs described in Section 2 above.

In the Kansas City Emissions Study, a large number of light-duty vehicles were equipped with data loggers. The data loggers were in operation while the vehicles were driven on a prescribed "conditioning" run (in which vehicles were driven on a set route in order to prepare them for dynamometer testing at the Kansas City site) and also while the vehicle owners operated the vehicles under regular driving conditions. The data loggers collected and stored second-by-second vehicle operating information, including speed, RPM data, mass air flow, and other data, as well as the time, and the latitude and longitude of the location of each second of driving.

The data stored by the data loggers was used to develop drive cycles that are representative of actual driving in Kansas City. The latitude and longitude coordinates of each second of driving were used to designate each second of driving as having taken place in a rural or an urban area, as well as the type of roadway (freeway, ramp, local road, etc.). All of the driving data was then divided into "micro-trips", where each micro-trip is the driving that begins either at vehicle-on or after an idle period, and ends either at vehicle-off or when the vehicle returns to idle. The micro-trips were divided into bins according to road type and average speed, and then for each road type/speed bin, a vector comparison process was used to select a small number of micro-trips that best represented the overall pool of micro-trips for that bin. These selected micro-trips became the driving cycle for each road type/speed combination.

Generally speaking, the development of fleet-specific driving cycles is a resource-intensive process. There are several steps, which include:

  1. Gathering in-use driving data for the fleet of interest. This typically includes outfitting selected vehicles from the fleet with data-loggers that collect and store, at a minimum, second-by-second vehicle speed. Additionally, storage of information such as engine RPM, mass air flow, and other operational data can be useful in the process of interpreting the driving history for the vehicle. Also, if drive cycles are to be developed for different road types, then the location of the vehicle at each second of driving must be collected and stored. The data that is collected must include enough vehicles, over enough time, to develop a large pool of second-by-second trip information that can be used to represent the general driving characteristics of the fleet. If a small number of vehicles or a short-duration study is used, there is the risk of developing a drive cycle that does not represent the general fleet. This step requires qualified field personnel to install the data loggers (this is critical: improperly installed data loggers result in low quality output data), the data loggers themselves, and a pool of vehicles and drivers. The second-by-second driving should ideally include several million seconds of usable driving.
  2. Interpretation and QA/QC of the driving data. Once the second-by-second data is collected, it needs to be checked for accuracy and consistency. Among other issues, data loggers can fail intermittently, speeds can get stuck, and random noise sometimes enters the signal. Every data logger and dataset will have its own issues, which need to be discovered and dealt with before the data can be relied upon for the development of drive cycles. This typically involves the manual examination of plots of second-by-second driving data, in addition to analysis of statistics for each trip (for example, mean speed, number of seconds at idle, maximum and minimum acceleration, etc.). This process can be very time consuming, but should be done by an analyst with a good understanding of vehicle operating characteristics, and with skill in extracting pertinent information from a large and variable dataset.
  3. Conversion of the driving data into a pool of micro-trips. This step prepares the driving data for use in selecting individual micro-trips as components of a drive cycle. It requires software that is capable of matrix manipulation (such as SAS, Matlab, or similar), a system that can store and manipulate datasets on the order of one or more gigabytes, and an analyst with an understanding of matrices as well as vehicle driving characteristics.
  4. Selection of the micro-trips that will comprise the desired driving cycle(s). This involves comparing each individual micro-trip with the pool of all micro-trips and choosing the micro-trips that best represent the overall pool. Similar to step 3, this step requires software that can work with matrices, a system that can handle the dataset size, and an analyst with an understanding of matrices.
  5. Post-processing of the selected driving cycle(s). This includes checking statistics for the final cycle, looking at plots of second-by-second speed and acceleration to check for anomalous data, and, depending on the data logger used, possibly smoothing out noise from the stored driving data.

Because of the level of effort required, we expect that many users will elect not to pursue development of vehicle drive cycles specific to their area, and instead choose to use the default drive cycles provided in the MOVES model. Nonetheless, we have documented the methodology used for this study in the hopes that it may be of use to others in the future who wish to undertake similar work. The remainder of the text in this section describes procedures undertaken specifically to convert the collected Kansas City activity data into drive cycles capable of being input into MOVES.

4.1 Preparation of Raw Data

Second-by-second driving data for light duty passenger vehicles was used for this study. The data was collected in Kansas City between July 2004 and April 2005. On-vehicle data loggers recorded the second-by-second driving data, while a GPS unit recorded the latitude and longitude coordinates of the vehicle for each second. A total of 4.6 million seconds of driving data were collected and available for use in this study. The driving data includes a number of repeated trips on a specific vehicle conditioning route, as well as normal, everyday driving by the vehicle owner.

The first step in processing the driving data was to use the latitude/longitude information to determine the location of the vehicle, and therefore the type of road being driven on, for each second of driving. The next step was to identify and attempt to correct any issues with the data that would reduce its quality for use in building drive cycles. These steps are described below.

Assignment of Road Types

A Geographic Information System (GIS) was used to assign road types to the trip data points. Second-by-second data points had latitude and longitude coordinates, which were mapped in batches using the World Geodetic System 1984 (WGS 84). An initial quality check was done to remove obviously erroneous data, such as (0,0) points, and trips that showed no relation to the area's road network.

Urban/rural designations were obtained by mapping each data point as either within or outside of an urban area from the U.S. Census Bureau's 2000 Urbanized Areas (UAs) shapefile. 24 The U.S. Census Bureau delineates UAs to provide a better separation of urban and rural territory, population, and housing in the vicinity of large places.

Road type classification was based on ESRI's 2008 U.S. and Canada Detailed Streets layer, which came from the 2003 Tele Atlas Dynamap Transportation version 5.2 product. This detailed road network includes 10 unique road classes indicating various roadway types such as highways, local roads, and so forth. Every road segment in the data set is categorized into one of these 10 road classes (shown in Table 4-1). Spatial analysis tools in the GIS were used to identify the road segment nearest each data point and assign the segment's road class to the point.

Table 4-1. Road classification codes and definitions

Road Class Road Class Definition
1 Limited Access
2 Highway
3 Major Road
4 Local Road
5 Minor Road
6 Other Road
7 Ramp
8 Ferry
9 Pedestrian Way
0 A special class for high-level routing cross-country and in complicated urban areas.
*00 Off-grid, parking lot, etc.

However, additional revisions were required to address two primary limitations of this automated process. First, data points located close to a cross street were sometimes classified by the software as having the cross street's road class when the trip as a whole clearly continued on the primary road. These points were identified and manually assigned the correct primary road class. Second, this approach ensured that all data points were assigned a road class, even if the trip did not follow road segments. Points that did not closely follow the road network were identified and assigned a separate "00" class indicating points or trip portions that were off-grid such as points off of roads, within parking lots, or with erroneous coordinates. Figure 4-1 presents an example plot of second-by-second observations, along with classifications of nearby roadway types.
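The nearest-segment assignment and the off-grid check can be sketched in a few lines of Python. This is a simplified illustration, not the GIS workflow used in the study: it assumes the coordinates have already been projected to planar x/y values in meters, and the 30-meter off-grid threshold and all function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to the endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def assign_road_class(point, segments, off_grid_m=30.0):
    """Return the road class of the nearest segment, or "00" if off-grid.

    segments is a list of (start_xy, end_xy, road_class) tuples.
    off_grid_m is a hypothetical distance threshold in meters.
    """
    best_class, best_dist = "00", float("inf")
    for start, end, road_class in segments:
        d = point_segment_distance(point, start, end)
        if d < best_dist:
            best_class, best_dist = road_class, d
    return best_class if best_dist <= off_grid_m else "00"


# A limited-access road (class 1) crossed by a local road (class 4).
segments = [((0, 0), (100, 0), "1"), ((50, -50), (50, 50), "4")]
on_highway = assign_road_class((20, 5), segments)
on_cross_street = assign_road_class((50, 20), segments)
off_grid = assign_road_class((20, 200), segments)
```

A purely distance-based rule like this one is what produces the cross-street misclassification described above, since a point near an intersection can be equally close to both roads.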

Figure 4-1. Sample Assignment of Road Types and Data Points

Figure depicting a sample road network, with individual roads classified by road type. Also shows individual second-by-second vehicle data points along some of the roads.

The road class definitions listed in Table 4-1 were next condensed to correlate to the five road types used by MOVES. This mapping was consistent with the correlation presented in Table 4-2.

Table 4-2. Road Classes Mapped to MOVES Road Types

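MOVES Road Type Road Class Area Classification
1 - Off-Network 00 - Off Grid All
1 - Off-Network 9 - Pedestrian Way All
2 - Rural Restricted Access 0 - Thru Way/Cross Country Rural
2 - Rural Restricted Access 1 - Limited Access Rural
2 - Rural Restricted Access 2 - Highway Rural
2 - Rural Restricted Access 7 - Ramp Rural
3 - Rural Unrestricted Access 3 - Major Road Rural
3 - Rural Unrestricted Access 4 - Local Road Rural
3 - Rural Unrestricted Access 5 - Minor Road Rural
3 - Rural Unrestricted Access 6 - Other Road Rural
4 - Urban Restricted Access 0 - Thru Way/Cross Country Urban
4 - Urban Restricted Access 1 - Limited Access Urban
4 - Urban Restricted Access 2 - Highway Urban
4 - Urban Restricted Access 7 - Ramp Urban
5 - Urban Unrestricted Access 3 - Major Road Urban
5 - Urban Unrestricted Access 4 - Local Road Urban
5 - Urban Unrestricted Access 5 - Minor Road Urban
5 - Urban Unrestricted Access 6 - Other Road Urban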
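The mapping in Table 4-2 amounts to a lookup on road class and area classification. A minimal Python sketch follows; the function and constant names are hypothetical, and road class 8 (Ferry) raises an error because it does not appear in Table 4-2.

```python
# Road classes grouped as in Table 4-2.
OFF_NETWORK_CLASSES = {"00", "9"}
RESTRICTED_CLASSES = {"0", "1", "2", "7"}
UNRESTRICTED_CLASSES = {"3", "4", "5", "6"}


def moves_road_type(road_class, area):
    """Map a road class code and an area ("rural" or "urban") to a MOVES road type."""
    if road_class in OFF_NETWORK_CLASSES:
        return 1  # Off-Network applies to all areas
    if area not in ("rural", "urban"):
        raise ValueError(f"unknown area classification: {area!r}")
    if road_class in RESTRICTED_CLASSES:
        return 2 if area == "rural" else 4
    if road_class in UNRESTRICTED_CLASSES:
        return 3 if area == "rural" else 5
    raise ValueError(f"road class {road_class!r} is not mapped in Table 4-2")


ramp_urban = moves_road_type("7", "urban")
local_rural = moves_road_type("4", "rural")
parking_lot = moves_road_type("00", "urban")
```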

Removal of Waiting Times at Test Center

Having assigned road types to the raw driving data on a second-by-second basis, the next step in our methodology was to begin the QA/QC process for the driving data. The first issue that was discovered was that many of the vehicle conditioning trips in the dataset included various periods of idle, along with some sporadic low-speed driving in the parking lot. This occurred both prior to driving of the designated test cycle, and also after the driving for the designated test cycle was completed. These instances were identified both by their latitude and longitude, and by their designation as being part of the "off-grid" road class, and thus were removed from the dataset. Any disconnected fragments of driving that were left at the beginning or the end of the trip after the removal of the parking lot periods were also removed from the dataset. This resulted in the deletion of 475,000 seconds from the dataset. An example of the type of data that was removed is shown in Figure 4-2. The data removed is represented by the red line, while the data kept is shown by the black line.

Figure 4-2. Waiting Times at Test Center

A plot of speed versus time for a sample vehicle. Speeds range from 0-70 mph over approximately 40 minutes. The last 7-8 minutes of the plot is color coded to highlight a speed of zero.

Blocks of Time with Stuck Speeds

Under normal driving conditions, it is rare for a vehicle to travel at the exact same speed for more than a few seconds. Even while driving on the freeway, a vehicle's speed tends to vary. In the Kansas City dataset, some blocks of time were identified for certain vehicles in which vehicle speed stayed constant for extended, improbable stretches. Additionally, a few blocks of time were found where the speed decreased perfectly evenly over several minutes, by a tenth of a mile per hour every few seconds. Visual examination of the speed versus time history around these stretches indicated that they were not representative of real driving, but rather a stuck voltage or other data logger malfunction. Thus, any blocks of driving where the speed either remained constant for more than 12 seconds, or decreased perfectly steadily, were deleted from the dataset. Ultimately 38,000 seconds were deleted for these reasons. An example of a data logger with a stuck speed is shown in Figure 4-3. The driving with the stuck speed is shown in red.
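The constant-speed part of this check can be sketched as a run-length scan over the speed trace. The sketch below is illustrative only: it assumes that zero-speed (idle) runs are exempt from the 12-second rule, it does not attempt to detect the steadily decreasing speeds, and the function name is hypothetical.

```python
def stuck_speed_mask(speeds, max_constant_s=12):
    """Flag seconds that belong to a run of identical non-zero speed
    lasting more than max_constant_s seconds.

    Returns a list of booleans, one per second of the input trace.
    """
    mask = [False] * len(speeds)
    start = 0
    for i in range(1, len(speeds) + 1):
        # A run ends at the end of the trace or when the speed changes.
        if i == len(speeds) or speeds[i] != speeds[start]:
            run_length = i - start
            if speeds[start] > 0 and run_length > max_constant_s:
                for j in range(start, i):
                    mask[j] = True
            start = i
    return mask


# 13 seconds stuck at 14 mph, followed by a 20-second idle that is kept.
trace = [10.0, 12.0] + [14.0] * 13 + [15.0] + [0.0] * 20
flags = stuck_speed_mask(trace)
```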

Figure 4-3. Data Logger with Stuck Speed

A plot of speed versus time for a sample vehicle. Speeds range from 0-70 mph over approximately 20 minutes. The first 3 minutes of the plot is color coded to highlight a constant speed of 14 mph.

False Trips

In some instances, a number of very short trips were recorded that appeared to be electronic "blips", as opposed to actual driving. For these, the trip duration was very short, speed increased gradually and then decreased only once (as opposed to repeated increases and decreases in speed that are usually seen), and the peak speed was very low - always less than 20 mph, but usually much lower than that. Therefore, a total of 1,211,000 seconds comprising these false trips were deleted. An example of this type of false trip is shown in red in Figure 4-4.

Figure 4-4. False Trips in Drive-Away Driving

A plot of speed versus time for a sample vehicle. Speeds range from 0-1 mph over approximately 2.5 minutes.

Engine Off Times

During the Kansas City study, the data loggers continued to record vehicle data even when the vehicle was turned off. Usually, engine RPM would be useful in the determination of whether the vehicle is at idle, or turned off; however, engine RPM was not always successfully stored by the data loggers in this study. Mass air flow (MAF) was stored, and was considered for use in differentiating between engine off and idle periods, but it was found that the MAF signal was very noisy, so that even when the vehicle engine was off, the MAF signal fluctuated above zero. Thus the vehicle's speed was the only useful variable in determining whether the vehicle was on or off. Any zero speed traces at the beginning of a trip, before any driving was done, and at the end of a trip, after all driving was done, were deleted from the dataset, which accounted for 1,306,000 seconds of data. This may have resulted in the accidental deletion of some periods where the vehicle was actually idling, rather than turned off, but this should not significantly affect the final cycles. After all of the edits discussed above were made, the remaining dataset contained records for 1,596,000 seconds of data.
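Because speed was the only usable signal, the engine-off rule reduces to trimming zero-speed seconds from both ends of each trip. A minimal sketch of that rule, with a hypothetical function name, might look like this:

```python
def trim_engine_off(speeds):
    """Drop zero-speed seconds before the first and after the last
    non-zero speed in a trip. Zero-speed seconds in between are kept.
    """
    moving = [i for i, s in enumerate(speeds) if s > 0]
    if not moving:
        return []  # the trip contains no driving at all
    return speeds[moving[0]:moving[-1] + 1]


trimmed = trim_engine_off([0, 0, 5, 10, 0, 0, 8, 0, 0, 0])
```

Interior zero-speed seconds survive the trim, so idle periods between stretches of driving remain available for the micro-trip segmentation.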

Assignment of Trip and Micro-Trip Starts

The driving cycles were created as a set of "micro-trips" with speed and acceleration characteristics that best matched the speed and acceleration of driving in the entire dataset. A micro-trip is defined as a contiguous speed trace of vehicle driving, and is made up of an engine idle, followed by all non-idle driving until the next idle begins. A single vehicle trip may be composed of numerous micro-trips. Each separate trip in the raw Kansas City dataset was labeled with a unique ID by the data logger, and these trip designations were not changed. The next step in the process was to sub-divide the driving in the raw data into micro-trips before the cycles were built. We established the beginning of a new micro-trip:

Whenever a new micro-trip was detected, the numeric identifier for the micro-trip was incremented. Micro-trip numbers for the entire dataset were unique.
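The numbering of micro-trips can be illustrated with a short Python sketch. The two start conditions used here, a change in the data logger's trip ID and a transition from non-zero speed to zero speed, are assumptions inferred from the micro-trip definition above rather than the exact criteria used in the study, and the function name is hypothetical.

```python
def assign_micro_trip_ids(records):
    """Assign a dataset-wide micro-trip number to each record.

    records is a time-ordered list of (trip_id, speed) tuples.
    A new micro-trip is assumed to begin when the trip ID changes
    or when an idle begins (speed drops from non-zero to zero).
    """
    ids = []
    current = 0
    prev_trip, prev_speed = None, None
    for trip_id, speed in records:
        new_trip = trip_id != prev_trip
        idle_begins = prev_speed is not None and prev_speed > 0 and speed == 0
        if new_trip or idle_begins:
            current += 1
        ids.append(current)
        prev_trip, prev_speed = trip_id, speed
    return ids


records = [
    ("A", 0), ("A", 0), ("A", 5), ("A", 8),  # idle, then driving
    ("A", 0), ("A", 0), ("A", 6),            # next idle starts a new micro-trip
    ("B", 4), ("B", 0),                      # new trip, then a new idle
]
micro_trip_ids = assign_micro_trip_ids(records)
```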

Binning of Continuous Variables

To use the cycle development approach discussed below in Section 4.2, all of the micro-trips in the edited dataset needed to have all of their second-by-second observations binned in terms of speed and acceleration. While the size of the bins is arbitrary, bins in general need to be narrow enough to resolve important emissions effects. In addition, bins need to be sufficiently narrow to distinguish different micro-trips for low speed/low acceleration micro-trips where those variables do not vary over a large range. On the other hand, from a practical perspective, the number of bins needs to be small so that the program that selects micro-trips can run in a reasonable amount of time.

For the cycle development in this project, we used the following binning schemes to bin the data:

Criteria for Skipping Micro-Trips for a Cycle

Three types of micro-trips were excluded from use in the candidate cycles. First, some micro-trips consisted of entirely idle operation. These micro-trips were not used since a dedicated idle cycle was not needed for use in the MOVES model.

Second, any micro-trips less than 20 seconds in duration were not considered for inclusion in the cycles. The reason for not including these is that many short micro-trips can be produced by common, but non-representative, operation of the vehicle. One example of such operation is when a vehicle starts moving from a standstill, but the engine dies because the clutch is let out too quickly. We have found in this study, as well as in past studies, that micro-trips longer than 20 seconds are adequate to describe the vehicle driving behavior of the entire dataset taken as a whole.

The third type of micro-trip that was excluded from consideration for the cycles was any micro-trip with only one non-zero second of driving.
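The three exclusion rules can be expressed as a single filter over the speed trace of each micro-trip. The following is a minimal sketch with a hypothetical function name, assuming one speed value per second.

```python
def is_candidate(speeds, min_duration_s=20):
    """Return True if a micro-trip may be used in a candidate cycle."""
    nonzero_seconds = sum(1 for s in speeds if s > 0)
    if nonzero_seconds == 0:
        return False  # entirely idle operation
    if len(speeds) < min_duration_s:
        return False  # shorter than 20 seconds
    if nonzero_seconds == 1:
        return False  # only one non-zero second of driving
    return True


all_idle = is_candidate([0] * 30)
too_short = is_candidate([0] * 5 + [10] * 10)
single_blip = is_candidate([0] * 25 + [3])
usable = is_candidate([0] * 10 + [5] * 10)
```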

Assignment of Micro-Trips to Road Type and Speed Bins

For this study, our intent was to create an array of drive cycles for driving on different road types and at different average speeds, similar to the array of cycles that make up the default drive cycles in MOVES. Initially, we proposed to develop four cycles: Rural Restricted, Rural Unrestricted, Urban Restricted, and Urban Unrestricted, corresponding to the four on-network road types contained in the MOVES model. However, once the dataset QA/QC process was complete, it was clear that there was far more urban data available than rural, and that there were not enough rural micro-trips to use as a pool from which to derive a driving cycle. Therefore, it was decided to divide the data among only two road types: Restricted (freeways, ramps, etc.), and Unrestricted (local roads).

Even with only two categories for the road types, it was found that the majority of micro-trips included driving on both of the two road types. In large part, this issue was an artifact of using latitude/longitude coordinates to determine which road the vehicle was on. It was common to see in the dataset stretches of driving that were almost entirely on one road, except that every few seconds, a single record was seen with a different road name. These turned out to be the intersections that the vehicle was driving through, because the coordinates for an intersection are the same for either of the cross streets. Since we needed to assign each micro-trip to a single bin, the assignments were made based on the road type that was found for the majority of the seconds of the micro-trip. A typical mixed-type micro-trip is shown in Figure 4-5. Restricted (highway) driving is shown by the black line, while unrestricted-road (local) driving is shown in red. The figure shows that the predominant driving in the micro-trip is on the highway, with brief switches to unrestricted roads, which can only be explained by intersections crossing the highway (either over it, or under it). Data processing for work such as this will be greatly facilitated, and the accuracy of road-type assignments will be improved, when software to correlate road names with GPS coordinates is developed to look at the vehicle's trajectory in addition to its latitude and longitude.

Figure 4-5. Highway Driving Interspersed with Unrestricted Cross Streets

A plot of speed versus time for a sample vehicle. Speeds range from 0-80 mph over approximately 40 minutes. Restricted (highway) driving is shown by the black line, while unrestricted-road (local) driving is shown in red. The driving shifts back and forth between restricted and unrestricted roads very frequently, with small gaps present in the data.

After each micro-trip was assigned to a specific MOVES road type, we then created speed bins. Each micro-trip was assigned to one speed bin, based on the mean speed for the micro-trip. Since an extended idle period at the beginning of a micro-trip could dramatically reduce the mean speed for the micro-trip, all idle periods were separated from the non-idle portion of the dataset.

The speed bins that we initially attempted to use in the binning process were designed to match the mean speeds for the driving cycles for light duty vehicles that currently exist in MOVES. However, Table 4-3 shows that these bins did not match the distribution of the driving data for Kansas City, in that the lower bins contain most of the micro-trips, while the higher-speed bins are underpopulated. Therefore, a second set of speed bins was developed, with the intent of dividing the bins such that each would contain approximately the same amount of data. The second set of speed bins is listed in Table 4-4, and these were the speed bins that were ultimately used. One drive cycle was developed for each of the twelve road type/mean speed bin combinations in Table 4-4. Comparison of the two tables shows that the Kansas City drive cycles will consist of generally lower average speeds than the MOVES drive cycles currently do. It is possible that this discrepancy can be attributed to the use of latitude/longitude to determine an appropriate road type. That is, some of the micro-trips with lower mean speeds that are labeled as Restricted do not have the cruising speed characteristics and long duration that are typically expected of freeway driving, and some of these micro-trips might actually be misclassified Unrestricted-road driving. (In previous analyses, prior to the availability of GPS coordinates associated with driving activity, micro-trips were assigned as restricted/unrestricted based only on the mean speed of the trip.)

The idle data that was separated from the non-idle driving data was then evaluated to determine the average idle length for micro-trips from each of the road type/speed bin combinations. It was found that the idle length was close to 20 seconds for all of the road type/speed bin combinations. That amount of idle time was added back to each micro-trip for the drive cycle.

Table 4-3. First Pass Speed Bins, Based on Default MOVES Drive Cycle Speed Ranges

Road Type Speed Range (mph) Number of MicroTrips Total Seconds of Driving
Restricted 00.0-22.5 933 51,516
Restricted 22.5-37.5 1868 217,178
Restricted 37.5-50.0 481 142,060
Restricted 50.0-60.0 258 222,670
Restricted 60.0-70.0 33 32,284
Restricted 70 + 1 109
Unrestricted 00.0-15.0 917 40,154
Unrestricted 15.0-25.0 2136 262,541
Unrestricted 25.0-35.0 1308 343,568
Unrestricted 35.0-47.5 164 42,206
Unrestricted 47.5-60.0 18 12,656

Table 4-4. Final Speed Bins, Based on Uniform Distribution of Trips Across Speed Ranges

Road Type Speed Range (mph) Number of MicroTrips Total Seconds of Driving
Restricted 00-20 543 30,222
Restricted 20-30 1610 123,354
Restricted 30-40 857 160,872
Restricted 40-50 272 96,306
Restricted 50-60 258 222,670
Restricted 60+ 34 32,393
Unrestricted 00-15 917 40,154
Unrestricted 15-20 815 88,395
Unrestricted 20-25 1321 174,146
Unrestricted 25-28 600 165,059
Unrestricted 28-32 555 145,134
Unrestricted 32+ 335 88,237
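The assignment of a micro-trip to one of the twelve bins in Table 4-4 can be sketched as a majority vote on road type followed by a lookup on mean non-idle speed. In the sketch below the bin edges come from Table 4-4, while the function names, the treatment of a speed that falls exactly on an edge (assigned to the higher bin), and the handling of ties in the majority vote are assumptions.

```python
import bisect
from collections import Counter

# Upper edges (mph) separating the Table 4-4 speed bins for each road type.
SPEED_BIN_EDGES = {
    "Restricted": [20, 30, 40, 50, 60],
    "Unrestricted": [15, 20, 25, 28, 32],
}
SPEED_BIN_LABELS = {
    "Restricted": ["00-20", "20-30", "30-40", "40-50", "50-60", "60+"],
    "Unrestricted": ["00-15", "15-20", "20-25", "25-28", "28-32", "32+"],
}


def majority_road_type(road_types):
    """Road type observed for the most seconds of the micro-trip.

    Ties resolve to whichever type appears first in the trace.
    """
    return Counter(road_types).most_common(1)[0][0]


def speed_bin_label(road_type, speeds_mph):
    """Speed bin based on the mean of the non-idle (non-zero) seconds."""
    moving = [s for s in speeds_mph if s > 0]
    mean_speed = sum(moving) / len(moving)
    index = bisect.bisect_right(SPEED_BIN_EDGES[road_type], mean_speed)
    return SPEED_BIN_LABELS[road_type][index]


road_type = majority_road_type(["Restricted"] * 8 + ["Unrestricted"] * 2)
label = speed_bin_label(road_type, [0, 0, 55, 57, 59])
```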

4.2 Drive Cycle Development

Representative drive cycles can be built from raw driving data using different methodologies. The methodology we have chosen for this study's cycle is to use pieces of real driving, called micro-trips, derived from available second-by-second driving data, and processed using a series of SAS programs developed by ERG. 25 When these micro-trips are connected together, they can be expected to represent driving behavior from the area of interest. The drive cycles we built were based on parameters of vehicle operation and usage that are known to be closely related to exhaust emissions. By using this approach of matching vehicle operation between measured driving behavior and the candidate cycle, it can be inferred that the emissions behavior of vehicles over the cycle will be similar to the emissions behavior of vehicles on the road.

In the creation of these driving cycles, we have chosen vehicle speed and acceleration as the variables that are important to exhaust emissions. These variables together provide a measure of the load on the engine, which is an important variable associated with exhaust emissions. In this study, we are building a cycle only for warmed up operation of light duty vehicles. That is, we are not building special cycles for cold starts and warm starts 26. We assume that all data in the datasets represent warmed-up driving.

General Methodology

A strategy based on minimizing the difference between a cycle vector C representing the driving in the candidate cycle and a target vector T representing the driving in the activity database for the case was used to select micro-trips from the database for inclusion in the cycle. As micro-trips are used to build up a candidate cycle, the difference between the two vectors tends to become smaller and smaller. The build-up process ends when the cycle developer decides that the two vectors are substantially the same and the duration of the cycle that has been built up is acceptable. The multi-dimensional space that these vectors are in will be described shortly, but first let us consider how the build-up process works for developing a cycle.

The goal of building the cycle is to select micro-trips such that when their vectors Mi are added together, the vector C of the resulting cycle is as similar as possible to the target vector T of the activity database. Figure 4-6 shows the hypothetical situation of the vectors after two micro-trips have been used to create a cycle. In this hypothetical example, the first micro-trip was selected from the activity database for the case as the one whose vector M1 was closest to the target vector T for the database. Then, a second micro-trip was sought such that when its vector M2 is added to M1 to create the resultant vector C shown in Figure 4-6, the distance between the tips of C and T is minimized. This distance is the length of the vector T-C as denoted in the figure by the dashed vector. As micro-trips are added to create the built-up cycle represented by C, the length of T-C is calculated after each additional micro-trip is added to the cycle to follow the progress of the build-up process. It should be noted that the order of the micro-trips in the final cycle is unimportant from the point of view of the selection of the micro-trips. The reason for this is that the resultant C is independent of the order in which the micro-trip vectors Mi are added together.

Figure 4-6. Vector Description of Comparing Target and Cycle Activity

A figure showing several vectors. Vectors M1 and M2 are summed to create a resultant bolded vector C. The vector C is displayed along with a target bolded vector, T; The difference between these two vectors is indicated by a dashed vector, T-C.

It should also be noted that we are forcing micro-trips to be added to the candidate cycle. This is done even if the addition of the best incremental micro-trip causes the length of T-C to increase in some instances. Generally, as the cycle is built up there will be a decrease in the length of T-C. After several micro-trips have been added, the length of T-C may increase slightly. Later, with the addition of more micro-trips, a "discovery" will be made that will produce a relatively abrupt decrease in the length of T-C so that the accumulated cycle will be substantially better than the cycle was much earlier in the build-up process.
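The forced, one-at-a-time build-up can be illustrated with a small greedy search in Python. This is a sketch under stated assumptions, not the SAS implementation used in the study: each micro-trip is represented by hypothetical bin counts in one dimension, the cycle vector is recomputed as the normalized cumulative distribution of the combined counts, each micro-trip is used at most once, and the process stops after a fixed number of selections rather than at the developer's discretion.

```python
import math


def normalized_cumulative(counts):
    """Cumulative counts divided by the total count."""
    total = sum(counts)
    running, result = 0, []
    for c in counts:
        running += c
        result.append(running / total)
    return result


def distance(u, v):
    """Euclidean length of the difference between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_cycle(micro_trips, target_counts, n_select):
    """Greedily add the micro-trip that leaves T-C shortest at each step.

    micro_trips maps a micro-trip ID to its bin counts. A micro-trip is
    always added, even if the length of T-C increases as a result.
    """
    target = normalized_cumulative(target_counts)
    cycle_counts = [0] * len(target_counts)
    remaining = dict(micro_trips)
    chosen, history = [], []
    for _ in range(min(n_select, len(remaining))):
        best_id, best_dist = None, float("inf")
        for mt_id, counts in remaining.items():
            trial = [c + m for c, m in zip(cycle_counts, counts)]
            d = distance(target, normalized_cumulative(trial))
            if d < best_dist:
                best_id, best_dist = mt_id, d
        added = remaining.pop(best_id)
        cycle_counts = [c + m for c, m in zip(cycle_counts, added)]
        chosen.append(best_id)
        history.append(best_dist)
    return chosen, history


# Hypothetical micro-trips; the target is the pool of all three.
micro_trips = {"low": [8, 2, 0, 0], "mid": [2, 5, 3, 0], "high": [0, 1, 4, 5]}
target_counts = [10, 8, 7, 5]
chosen, history = build_cycle(micro_trips, target_counts, n_select=3)
```

In this toy example the second forced addition lengthens T-C before the third shortens it to zero, mirroring the temporary increases described above.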

All of the vectors used above to describe the build-up process are based on representations of the frequency distributions of observations in cumulative speed, acceleration, vehicle specific power space. This statement requires some explanation. A segment of driving, whether it is a micro-trip, a piece of a driving cycle, or the entire activity database, can be described as a frequency distribution. The distribution consists of combinations of two variables: speed and acceleration. The continuous values for these variables were converted into frequency distributions through the use of bins. Each observation in the database was placed in a particular speed/acceleration bin. The cumulative frequency distribution is made up of the number of observations that fall "below" the current bin for each of the two binned variables. The binning criteria for the two variables are described above in Section 4.1. To help the reader understand the process, we will present a numerical example in one dimension and another example in two dimensions to demonstrate how the comparison of the vectors T and C works.

Suppose we wanted to compare a candidate cycle with the database using a single vehicle operation variable that was monitored second-by-second in the collection of data for the activity database. The single variable might be engine load. In this hypothetical example, we have 35,900 one-second observations of engine load in the target activity database and 68 one-second observations in the cycle. The first step in comparing T and C is to bin the observations of load in the target data and in the cycle data. Table 4-5 shows the binning of the hypothetical data in Columns 2 and 3. Note that the target counts in Column 2 are much larger than the cycle counts in Column 3. This is a consequence of the activity database containing all of the observations for all micro-trips, while the cycle has just one micro-trip. The counts in Columns 2 and 3 were converted to cumulative counts in Columns 4 and 5. This is done to provide proximity information for the micro-trip searching algorithm. In other words, we wanted the algorithm to be able to select a micro-trip even if the observations for a given micro-trip were not in exactly the same bins as the target, but did have observations at least in a nearby bin. The use of the cumulative distributions helps ensure that proximity information is available.

Table 4-5. Comparison of Cycle and Target Vectors for a Hypothetical One-Dimensional Example

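Columns, in order: (1) Bin; Counts: (2) Target, (3) Cycle; Cumulative Counts: (4) Target, (5) Cycle; Vector (Normalized Cumulative Counts): (6) Target, (7) Cycle; Vector Length: (8) T, (9) C, (10) T-C. The vector lengths are listed once, in the Bin 1 row.
Bin Target Cycle Target Cycle Target Cycle T C T-C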
1 1000 0 1000 0 0.028 0.000 1.246 1.266 0.138
2 11000 30 12000 30 0.334 0.441      
3 7000 10 19000 40 0.529 0.588      
4 6000 7 25000 47 0.696 0.691      
5 4500 5 29500 52 0.822 0.765      
6 2800 1 32300 53 0.900 0.779      
7 1500 4 33800 57 0.942 0.838      
8 800 6 34600 63 0.964 0.926      
9 600 1 35200 64 0.981 0.941      
10 700 4 35900 68 1.000 1.000      

A comparison of the cumulative counts for the target and cycle in Columns 4 and 5 shows that if we used these counts directly to create the T and C vectors, the lengths of the vectors would differ greatly simply because the target vector, made up of the 10 elements in Column 4, would be much longer than the cycle vector, made up of the 10 elements in Column 5. Accordingly, we normalize the target and cycle cumulative counts in Columns 4 and 5 by their respective totals to produce the target and cycle vector elements, the fractional values between 0 and 1 shown in Columns 6 and 7.

The values in Columns 6 and 7 become the elements of the T and C vectors, which are in 10-dimensional space. A visualization of the elements of these vectors is provided in Figure 4-7. This figure shows the normalized cumulative counts of the target and cycle from Columns 6 and 7 as a function of the bin number. What we want to do in developing the cycle is select micro-trips so that the curve for the cycle is as close as possible to the curve for the target in this figure. We do this by minimizing the sum of the squares of the differences between corresponding elements of the target and cycle vectors. This sum is the square of the length of T-C. Table 4-5 shows the calculated lengths of T, C, and T-C. These lengths can be determined from the elements of T and C in Columns 6 and 7 using the standard relationship for the length of a vector: the square root of the sum of the squares of its elements.

Figure 4-7. Visual Comparison of Vector Elements

A plot of normalized cumulative counts (ranging from 0 to 1.0) versus bins (1-10) for both the target and the cycle. The target plot follows a parabolic curve from bin 1 to 10, sharply increasing at bin 1, and gradually flattening as it approaches bin 10. The cycle plot roughly follows the target plot, with some minor deviation.
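
The one-dimensional comparison can be sketched in a few lines of Python. This is an illustration only, not the tool used in the study, and the function names are assumptions. The bin counts are copied from Columns 2 and 3 of Table 4-5. Summed over all ten bins, the length of T-C comes out at roughly 0.217; the lengths printed in Table 4-5 (1.246, 1.266, and 0.138) appear to correspond to sums over the first five bins only, so the sketch reports both.

```python
from itertools import accumulate
from math import sqrt

# Counts per bin, copied from Columns 2 and 3 of Table 4-5.
target_counts = [1000, 11000, 7000, 6000, 4500, 2800, 1500, 800, 600, 700]
cycle_counts = [0, 30, 10, 7, 5, 1, 4, 6, 1, 4]


def normalized_cumulative(counts):
    """Running totals divided by the grand total (Columns 6 and 7)."""
    running = list(accumulate(counts))
    return [value / running[-1] for value in running]


def length(vector):
    """Euclidean length: square root of the sum of squared elements."""
    return sqrt(sum(x * x for x in vector))


T = normalized_cumulative(target_counts)
C = normalized_cumulative(cycle_counts)
diff = [t - c for t, c in zip(T, C)]

print("T elements:", [round(t, 3) for t in T])
print("C elements:", [round(c, 3) for c in C])
print("length of T-C, all ten bins:", round(length(diff), 3))
print("length of T-C, first five bins:", round(length(diff[:5]), 3))
```

Minimizing the length of T-C and minimizing its square select the same micro-trips, so either quantity can serve as the search criterion.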

Extension of the one-dimensional example shown in Table 4-5 and Figure 4-7 to multiple dimensions is demonstrated by the spreadsheet calculations shown in Table 4-6. In this example, 100 matrix elements are used. The table shows 10 rows, which might be accelerations, and 10 columns, which might be speeds. Table 4-6 shows the calculations for the target matrix in Tables a), c), e), and g) and the calculations for the cycle matrix in Tables b), d), f), and h). In Tables a) and b), the second-by-second observations of the target and cycle data are binned. The numbers in each bin represent the frequency of observations that meet the criteria for those bins. In Tables c) and d), the counts in Tables a) and b) are accumulated across each row. Then, in Tables e) and f), the accumulated frequencies in Tables c) and d) are accumulated down each column. This produces a field of cumulative frequencies that runs from a low value in the upper left corner of each matrix to a high value in the lower right corner of each matrix. The value in the lower right corner of Tables e) and f) is equal to the total number of observations in the target or cycle matrix. These totals are used to normalize all of the frequencies in Tables e) and f) to arrive at the normalized cumulative matrices in g) and h). The values in g) and h) are then used to calculate the square of the difference for each pair of corresponding matrix elements, producing the values in Table i). The value in Table j) is the sum of all of the elements of Table i) and represents the square of the length of the T-C vector. This is the value that we attempt to minimize when selecting micro-trips for the cycle.

Note that the counts in a) and b) did not need to be in corresponding bins for this comparison process to work. The use of cumulative distributions permitted the two matrices to be compared successfully.

Extension of the technique to the third dimension for vehicle specific power or any number of higher dimensions could be made by analogy.

Table 4-6. Comparison of Cycle and Target Matrices for a Hypothetical Two-Dimensional Example

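  Target Activity Matrix
  a) Count the second-by-second observations in each bin.
  A B C D E F G H I J
1 2 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0
3 0 2 0 5 0 0 0 0 0 0
4 0 0 5 0 3 0 2 1 0 0
5 0 5 0 9 1 0 0 2 9 3
6 0 0 2 0 0 4 1 0 0 0
7 0 0 0 0 0 0 0 0 0 0
8 0 0 6 0 0 1 0 0 0 0
9 0 1 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0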
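  Cycle Activity Matrix
  b) Count the second-by-second observations in each bin.
  A B C D E F G H I J
1 0 0 0 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0 0 0
3 0 0 0 4 0 0 0 0 0 0
4 0 4 0 0 0 3 0 1 0 0
5 0 0 0 0 0 0 0 4 1 0
6 0 0 0 8 0 0 0 0 0 2
7 0 0 0 0 3 0 0 0 0 0
8 0 0 0 0 0 0 1 0 0 0
9 1 5 0 0 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 0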
   
  c) Accumulate the frequencies in a) across each row.
  A B C D E F G H I J
1 2 2 2 2 2 2 2 2 2 2
2 0 1 1 1 1 1 1 1 1 1
3 0 2 2 7 7 7 7 7 7 7
4 0 0 5 5 8 8 10 11 11 11
5 0 5 5 14 15 15 15 17 26 29
6 0 0 2 2 2 6 7 7 7 7
7 0 0 0 0 0 0 0 0 0 0
8 0 0 6 6 6 7 7 7 7 7
9 0 1 1 1 1 1 1 1 1 1
10 0 0 0 0 0 0 0 0 0 0
 
  d) Accumulate the frequencies in b) across each row.
  A B C D E F G H I J
1 0 0 0 0 0 0 0 0 0 0
2 0 1 1 1 1 1 1 1 1 1
3 0 0 0 4 4 4 4 4 4 4
4 0 4 4 4 4 7 7 8 8 8
5 0 0 0 0 0 0 0 4 5 5
6 0 0 0 8 8 8 8 8 8 10
7 0 0 0 0 3 3 3 3 3 3
8 0 0 0 0 0 0 1 1 1 1
9 1 6 6 6 6 6 6 6 6 6
10 0 0 0 0 0 0 0 0 0 0
   
  e) Accumulate the frequencies in c) down each column.
  A B C D E F G H I J
1 2 2 2 2 2 2 2 2 2 2
2 2 3 3 3 3 3 3 3 3 3
3 2 5 5 10 10 10 10 10 10 10
4 2 5 10 15 18 18 20 21 21 21
5 2 10 15 29 33 33 35 38 47 50
6 2 10 17 31 35 39 42 45 54 57
7 2 10 17 31 35 39 42 45 54 57
8 2 10 23 37 41 46 49 52 61 64
9 2 11 24 38 42 47 50 53 62 65
10 2 11 24 38 42 47 50 53 62 65
 
  f) Accumulate the frequencies in d) down each column.
  A B C D E F G H I J
1 0 0 0 0 0 0 0 0 0 0
2 0 1 1 1 1 1 1 1 1 1
3 0 1 1 5 5 5 5 5 5 5
4 0 5 5 9 9 12 12 13 13 13
5 0 5 5 9 9 12 12 17 18 18
6 0 5 5 17 17 20 20 25 26 28
7 0 5 5 17 20 23 23 28 29 31
8 0 5 5 17 20 23 24 29 30 32
9 1 11 11 23 26 29 30 35 36 38
10 1 11 11 23 26 29 30 35 36 38
   
  g) Normalize the elements of the matrix in e).
  A B C D E F G H I J
1 0.031 0.031 0.031 0.031 0.031 0.031 0.031 0.031 0.031 0.031
2 0.031 0.046 0.046 0.046 0.046 0.046 0.046 0.046 0.046 0.046
3 0.031 0.077 0.077 0.154 0.154 0.154 0.154 0.154 0.154 0.154
4 0.031 0.077 0.154 0.231 0.277 0.277 0.308 0.323 0.323 0.323
5 0.031 0.154 0.231 0.446 0.508 0.508 0.538 0.585 0.723 0.769
6 0.031 0.154 0.262 0.477 0.538 0.600 0.646 0.692 0.831 0.877
7 0.031 0.154 0.262 0.477 0.538 0.600 0.646 0.692 0.831 0.877
8 0.031 0.154 0.354 0.569 0.631 0.708 0.754 0.800 0.938 0.985
9 0.031 0.169 0.369 0.585 0.646 0.723 0.769 0.815 0.954 1.000
10 0.031 0.169 0.369 0.585 0.646 0.723 0.769 0.815 0.954 1.000
 
  h) Normalize the elements of the matrix in f).
  A B C D E F G H I J
1 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
2 0.000 0.026 0.026 0.026 0.026 0.026 0.026 0.026 0.026 0.026
3 0.000 0.026 0.026 0.132 0.132 0.132 0.132 0.132 0.132 0.132
4 0.000 0.132 0.132 0.237 0.237 0.316 0.316 0.342 0.342 0.342
5 0.000 0.132 0.132 0.237 0.237 0.316 0.316 0.447 0.474 0.474
6 0.000 0.132 0.132 0.447 0.447 0.526 0.526 0.658 0.684 0.737
7 0.000 0.132 0.132 0.447 0.526 0.605 0.605 0.737 0.763 0.816
8 0.000 0.132 0.132 0.447 0.526 0.605 0.632 0.763 0.789 0.842
9 0.026 0.289 0.289 0.605 0.684 0.763 0.789 0.921 0.947 1.000
10 0.026 0.289 0.289 0.605 0.684 0.763 0.789 0.921 0.947 1.000
   
  i) Calculate the squares of the differences in corresponding elements of the matrices in g) and h).
  A B C D E F G H I J
1 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001
2 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
3 0.001 0.003 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.000
4 0.001 0.003 0.000 0.000 0.002 0.002 0.000 0.000 0.000 0.000
5 0.001 0.000 0.010 0.044 0.073 0.037 0.050 0.019 0.062 0.087
6 0.001 0.000 0.017 0.001 0.008 0.005 0.014 0.001 0.021 0.020
7 0.001 0.000 0.017 0.001 0.000 0.000 0.002 0.002 0.005 0.004
8 0.001 0.000 0.049 0.015 0.011 0.010 0.015 0.001 0.022 0.020
9 0.000 0.014 0.006 0.000 0.001 0.002 0.000 0.011 0.000 0.000
10 0.000 0.014 0.006 0.000 0.001 0.002 0.000 0.011 0.000 0.000
j) Sum the squares of the differences.
0.754
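
The steps of Table 4-6 can be reproduced with a short Python sketch. It is offered as an illustration under stated assumptions rather than as the spreadsheet used in the study: the function names are hypothetical, and the bin positions of the non-zero counts are read from Tables a) and b), with the row-accumulated Tables c) and d) used to confirm the columns.

```python
from itertools import accumulate

COLUMNS = "ABCDEFGHIJ"

# Non-zero bin counts from Tables a) and b): (row, column) -> count.
target_counts = {
    (1, "A"): 2, (2, "B"): 1, (3, "B"): 2, (3, "D"): 5,
    (4, "C"): 5, (4, "E"): 3, (4, "G"): 2, (4, "H"): 1,
    (5, "B"): 5, (5, "D"): 9, (5, "E"): 1, (5, "H"): 2, (5, "I"): 9, (5, "J"): 3,
    (6, "C"): 2, (6, "F"): 4, (6, "G"): 1,
    (8, "C"): 6, (8, "F"): 1, (9, "B"): 1,
}
cycle_counts = {
    (2, "B"): 1, (3, "D"): 4, (4, "B"): 4, (4, "F"): 3, (4, "H"): 1,
    (5, "H"): 4, (5, "I"): 1, (6, "D"): 8, (6, "J"): 2,
    (7, "E"): 3, (8, "G"): 1, (9, "A"): 1, (9, "B"): 5,
}


def dense(sparse):
    """Expand the sparse counts into a 10 x 10 matrix (Tables a, b)."""
    grid = [[0] * 10 for _ in range(10)]
    for (row, col), count in sparse.items():
        grid[row - 1][COLUMNS.index(col)] = count
    return grid


def cumulative(grid):
    """Accumulate across each row (Tables c, d), then down each column (e, f)."""
    across = [list(accumulate(row)) for row in grid]
    return [list(r) for r in zip(*(accumulate(col) for col in zip(*across)))]


def normalize(cum):
    """Divide by the lower-right total (Tables g, h)."""
    total = cum[-1][-1]
    return [[value / total for value in row] for row in cum]


target_cum = cumulative(dense(target_counts))
cycle_cum = cumulative(dense(cycle_counts))
T = normalize(target_cum)
C = normalize(cycle_cum)

# Tables i) and j): squared differences, then their sum.
sum_sq = sum((t - c) ** 2 for t_row, c_row in zip(T, C) for t, c in zip(t_row, c_row))
print(target_cum[-1][-1], cycle_cum[-1][-1], round(sum_sq, 3))
```

Run as written, the sketch gives totals of 65 and 38 observations and a sum of squared differences that rounds to the 0.754 shown in Table j).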

4.3 Development of the Twelve Drive Cycles for Inclusion in MOVES

The dataset containing all of the second-by-second driving activity for the Kansas City vehicles was edited and prepared as described in Section 4.1, resulting in a set of micro-trips distributed into twelve bins according to mean speed and dominant road type. The approach described in Section 4.2 was then used to select the micro-trips that would comprise a drive cycle for each of the twelve bins. The micro-trip speed and acceleration information was used to find those micro-trips which, when concatenated, best described the driving activity for each bin. For each bin, ERG's SAS-based cycle development program converted each micro-trip into a speed/acceleration vector and the entire set of micro-trips for that bin into the speed/acceleration target vector T. The program then found the micro-trip with the smallest sum of squared differences between its cumulative normalized elements and the corresponding elements of the target. This is equivalent to finding the micro-trip for which the length of the T-C vector was smallest. This became the first micro-trip in the cycle for that bin. Then, the program looked through all remaining micro-trips from the bin to find the best second micro-trip, such that when it was added to the first micro-trip the new vector T-C had a minimum length. This process may be repeated until the developer wants to stop searching. In this study, we stopped searching after 25 micro-trips were added to a cycle.

Finally, a 20-second idle period was added to the beginning of each micro-trip, representing the mean idle time that was found for micro-trips from each road type/speed bin combination.
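
The build-up search described in this section can be sketched as a greedy loop. The Python below is a simplified illustration, not ERG's SAS-based program: the function names are hypothetical, each micro-trip is assumed to be already binned into a small speed/acceleration count matrix with at least one observation, and the three toy micro-trips are made-up data.

```python
from itertools import accumulate


def normalized_cumulative(grid):
    """Accumulate counts across rows, then down columns, then divide by the total."""
    across = [list(accumulate(row)) for row in grid]
    down = [list(r) for r in zip(*(accumulate(col) for col in zip(*across)))]
    total = down[-1][-1]  # assumes the grid holds at least one observation
    return [[value / total for value in row] for row in down]


def squared_distance(a, b):
    """Square of the length of the difference between two matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def add_grids(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_cycle(micro_trips, max_trips):
    """Greedily add the micro-trip that brings the cycle closest to the target."""
    rows, cols = len(micro_trips[0]), len(micro_trips[0][0])
    target = [[0] * cols for _ in range(rows)]
    for trip in micro_trips:
        target = add_grids(target, trip)  # target = all micro-trips in the bin
    T = normalized_cumulative(target)

    cycle = [[0] * cols for _ in range(rows)]
    remaining = list(range(len(micro_trips)))
    order, history = [], []
    while remaining and len(order) < max_trips:
        def error_if_added(i):
            candidate = add_grids(cycle, micro_trips[i])
            return squared_distance(T, normalized_cumulative(candidate))

        best = min(remaining, key=error_if_added)
        history.append(error_if_added(best))  # square of the length of T-C
        cycle = add_grids(cycle, micro_trips[best])
        remaining.remove(best)
        order.append(best)
    return order, history


# Toy 2 x 2 micro-trips (made-up counts, for illustration only).
trips = [
    [[1, 0], [0, 0]],
    [[0, 0], [0, 1]],
    [[1, 0], [0, 1]],
]
first_order, first_history = build_cycle(trips, max_trips=1)
full_order, full_history = build_cycle(trips, max_trips=3)
print(first_order, first_history)
print(full_order, full_history)
```

The `history` list plays the role of the curves plotted in Appendix B: one value of the squared length of T-C for each micro-trip added.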

Plots of the square of the length of the T-C vector as micro-trips were added to the cycle for each bin are presented in Appendix B in Figures B-1a through B-1l. These values are the sums of the squared differences between the cycle vector C and the target vector T. The left vertical axis of each figure shows that the square of the length of the T-C vector drops continuously as additional micro-trips are added to the cycle (red line). The right vertical axis of each figure shows the cumulative duration of the cycle as additional micro-trips are added to the cycle (dashed black line). What these figures reveal is that none of the cycles really needs all 25 micro-trips to adequately represent the driving for the bin. For each of the twelve cycles, little additional benefit was achieved by adding micro-trips after about the 15th trip; the sum of the squared differences between the vectors did not continue to decrease after that point. Additionally, the use of 25 micro-trips tended to make the final cycle much longer than the cycles of 1000 seconds or so that are currently used in MOVES. This was especially true for the higher-speed bins, whose micro-trips tended to be much longer than the micro-trips for the lower-speed bins. Therefore, we elected to use only the first 15 micro-trips for each of the cycles, with the exception of the two highest-speed cycles. These are the Restricted 50-60 cycle, for which only the first five micro-trips were used, and the Restricted 60+ cycle, for which only the first seven micro-trips were used. These two cycles contain micro-trips that are very long, but the figures show slight minima in the sum of the squared differences at the fifth and the seventh micro-trip, respectively.

Speed versus Time Traces for Candidate Cycles

After the micro-trips for the twelve cycles were selected, several evaluations of the cycles were made, including comparisons of the cycle to the target dataset. First, we examined a speed versus time plot of the micro-trips that make up each cycle, shown in Appendix B in Figures B-2a through B-2l. The small circles on the plot indicate the beginning of each micro-trip. These candidate cycle plots were used to examine the overall appearance of the cycle and to show the duration of the cycle. These figures show that the micro-trips for the different road types and speed bins clearly represent different types of driving.

Comparison of Statistics for Cycle and Target

We also made comparisons of the chosen cycles to the corresponding target datasets. In particular, we wished to confirm that the speed and acceleration characteristics of the two were similar. We first examined scatter plots of acceleration versus speed for the candidate cycle and for the target dataset. Rather than show all 24 plots here, only a sample is shown in Appendix B in Figures B-3a through B-3d: a comparison of speed versus acceleration for the cycle and the target for the Restricted 50-60 MPH bin and for the Unrestricted 25-28 MPH bin. All of the data points for the cycle are shown on the two cycle plots, but due to the large size of the target databases, only a random subset of the target data is shown on each of the two target plots. Comparison of Figures B-3a and B-3b shows that the cycle and the target have very similar speed versus acceleration profiles, with the highest density of data at speeds between 60 and 70 MPH and accelerations between -2 and 2 MPH/s, and a somewhat lighter concentration between 30 and 40 MPH. However, while the target database contains quite a few observations with speeds near 80 MPH, there are fewer of these high speeds in the cycle. For Figures B-3c and B-3d, the cycle and the target again look similar. Here the densest area of points is at about 30 MPH, and again we see more observations at the highest speeds (above 45 MPH) in the target than in the cycle. It appears that a few trips in the target bins contain outliers: short periods of especially high-speed (for that bin) driving. Because they are few in number, they do not make up the majority of the driving for the bin and do not get selected for inclusion in the cycle. This was observed for most of the 12 bins, and has been seen in ERG's previous cycle-building projects as well.

The frequency distributions of speeds found in the new cycles and the corresponding target databases are shown in Figures B-4a through B-4d. Again, this is only a sample, for the Restricted 50-60 MPH bin and the Unrestricted 25-28 MPH bin. From these figures we see once more that while the speed distributions for the cycles and the targets are similar, the target does contain a few slightly higher speeds than the cycle.

A number of statistics were calculated so that the characteristics of the cycles could be compared to the characteristics of the respective target datasets. These statistics are listed in Table 4-7. It is important to remember when comparing any of these statistics that the micro-trips in each cycle were selected only because their non-idle speed and acceleration characteristics match those of the target. Any other statistics that are calculated and compared were not the basis, or at least not the direct basis, for choosing the micro-trips for the cycles.

Table 4-7 shows that the average second-by-second speeds of the selected cycle and target database are similar, although the cycle speeds are slightly lower than those in the target dataset. This corresponds to what was observed in Figures B-3 and B-4 in Appendix B: there are a few especially high speeds in each target bin that are not representative of the target bin as a whole, and thus were not selected for inclusion in the cycles. They do, however, raise the mean speed of the target slightly above that of the cycle. Finally, Table 4-7 shows that the square of the length of T-C was very low for each of the cycles. This indicates a good fit between the driving conditions (speed and acceleration) in the selected cycle and those in the target database.

Table 4-7. Comparison of Dataset and Cycle Operation Characteristics

Columns, in order: (1) Bin; Target: (2) Average Speed (mph), (3) Average Micro-Trip Time (s), (4) Average Micro-Trip Distance (miles); Cycle: (5) Average Speed (mph), (6) Average Micro-Trip Time (s), (7) Average Micro-Trip Distance (miles), (8) Total Time (s), (9) Total Distance (miles), (10) Micro-Trips (count), (11) Final Square of Length of T-C. Rest. = Restricted; Unrst. = Unrestricted.
Rest. 0-20 16 56 0.25 12 75 0.26 1127 3.91 15 0.012
Rest. 20-30 26 77 0.55 20 87 0.49 1300 7.38 15 0.013
Rest. 30-40 35 188 1.85 32 189 1.70 2842 25.45 15 0.013
Rest. 40-50 45 354 4.43 42 320 3.77 4801 56.54 15 0.015
Rest. 50-60 55 863 13.12 54 834 12.43 4170 62.15 5 0.022
Rest. 60+ 63 953 16.57 61 713 12.05 4989 84.37 7 0.054
Unrst. 0-15 9 44 0.11 6 62 0.11 924 1.66 15 0.028
Unrst. 15-20 18 108 0.55 15 94 0.38 1405 5.76 15 0.013
Unrst. 20-25 23 132 0.83 19 109 0.57 1634 8.60 15 0.011
Unrst. 25-28 27 275 2.03 24 181 1.20 2713 18.00 15 0.011
Unrst. 28-32 30 262 2.15 27 186 1.38 2783 20.72 15 0.010
Unrst. 32+ 39 263 2.86 35 190 1.85 2845 27.73 15 0.037

MOVES Modeling of Kansas City Drive Cycles

Drive cycles within the MOVES database are stored in three tables of interest: driveschedule, drivescheduleassoc, and driveschedulesecond. Per discussions with EPA staff 27, it was necessary to completely replace the existing MOVES light duty drive cycles, in order to avoid potential interpolation issues during model calculations. This was done by removing selected light duty drive cycles from the default database, and importing the new drive cycle data described above using completely new drivescheduleIDs.

First, using the default driveschedule table as a basis, we eliminated all existing light duty drive cycles with the exception of drivescheduleIDs 101 and 199. The former was intended to simulate very low speeds (average 2.5 mph), and the latter was intended to simulate activity on ramps. We did not feel that either type of driving could be accurately simulated with the drive cycles we developed from the Kansas City dataset, so both were left intact. The driveschedule table was then populated with the proper data.

Similarly, the default drivescheduleassoc and driveschedulesecond tables were cleansed of existing light duty cycles, and new information was imported. In the case of drivescheduleassoc, Restricted cycles were associated with MOVES roadtypeIDs 2 and 4, while Unrestricted cycles were associated with MOVES roadtypeIDs 3 and 5. Associations were made for MOVES sourcetypeIDs 11, 21, 31, and 32, which correspond to motorcycles, passenger cars, passenger trucks, and light commercial trucks, respectively. These associations are consistent with the default drive cycles removed from the MOVES database. In the case of driveschedulesecond, new data was added for each cycle on a second-by-second basis, using speeds calculated during the development process described above.
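
The association logic can be illustrated with a short Python sketch. The drivescheduleIDs below are hypothetical placeholders (the IDs actually assigned in the study are not listed in this section), and the row layout of (sourcetypeID, roadtypeID, drivescheduleID) is an assumption made for illustration; it should be checked against the actual drivescheduleassoc schema before use.

```python
from itertools import product

# Hypothetical placeholder IDs for the six Restricted and six Unrestricted cycles.
RESTRICTED_IDS = [9001, 9002, 9003, 9004, 9005, 9006]
UNRESTRICTED_IDS = [9101, 9102, 9103, 9104, 9105, 9106]

# Road type and source type IDs as stated in the text.
RESTRICTED_ROAD_TYPES = (2, 4)
UNRESTRICTED_ROAD_TYPES = (3, 5)
SOURCE_TYPES = (11, 21, 31, 32)  # motorcycles, passenger cars, passenger trucks, light commercial trucks


def association_rows(schedule_ids, road_types):
    """One row per (source type, road type, drive schedule) combination."""
    return [
        (source_type, road_type, schedule_id)
        for schedule_id, road_type, source_type in product(schedule_ids, road_types, SOURCE_TYPES)
    ]


rows = (
    association_rows(RESTRICTED_IDS, RESTRICTED_ROAD_TYPES)
    + association_rows(UNRESTRICTED_IDS, UNRESTRICTED_ROAD_TYPES)
)

# Tab-delimited text lines, one per association row.
lines = ["\t".join(str(value) for value in row) for row in rows]
print(len(rows))
print(lines[0])
```

Each of the twelve cycles is paired with two road types and four source types, giving 96 association rows in this sketch.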

Having populated updated tables containing new Kansas City study-based cycles for light duty vehicles, it was necessary to then replace each of the existing MOVES default drive cycle tables and perform additional modeling. Before doing so, ERG used the MySQL Administrator program, which is typically included with a normal MOVES installation, to make a SQL backup of the three existing tables of interest. This is important for two reasons: first, to have a backup of the affected tables in case there is an error during the import of the new drive cycle tables; and second, to be able to easily switch back to the default MOVES drive cycles for future modeling. Users should be aware that alteration of drive cycle tables in this way will affect all future modeling runs done on an individual computer system, so it is crucial to back up the default drive cycles in such a way that they can be easily restored. 28

Loading of the updated drive cycle tables was done with a simple SQL script, in which the existing default tables were truncated and the new tables were imported in the form of tab-delimited text files. Having successfully loaded the new tables, a backup was made again, using MySQL Administrator, for ease of switching between drive cycle sets. Finally, the TDM-based and HPMS-based MOVES runs described in Sections 2 and 3 were re-created and re-run using the updated drive cycles to provide additional model outputs for further analysis. Comparisons of model outputs are discussed in Sections 5 and 6 below.

For reference, all of the MOVES database tables and SQL scripts described in this section are provided as electronic files in Appendix A.


24 http://www.census.gov/geo/maps-data/data/tiger-cart-boundary.html

25 The methodology used in this study was previously developed in ERG's Roadway-Specific Driving Schedules for Heavy-Duty Vehicles, prepared for EPA in August 2003.

26 Drive cycles were only developed for warmed up operations because MOVES exhaust emissions are based on warmed up operations. Emissions associated with vehicle starts are calculated in MOVES using a methodology that is not connected with drive cycles.

27 Note that we are not aware of any formal guidance from EPA on replacement of drive cycles in the model. We consulted directly with David Brzezinski and Sean Hillson at EPA during this stage of our study.

28 Note that the most current version of MOVES at this writing, MOVES2010a, allows users to alter the drive cycle tables discussed in this section via the County Data Manager GUI, rendering obsolete the procedure described here for importing alternate drive cycles into MOVES.

Updated: 04/04/2014