U.S. Department of Transportation / Federal Highway Administration

Office of Planning, Environment, & Realty (HEP)
Planning · Environment · Real Estate


A Freight Analysis and Planning Model

1. Introduction

1.1 Introduction

The purpose of this research project is to link a newly developed method for estimating intra-metropolitan freight flows with widely used transportation planning software, and to demonstrate the model for regional transportation planning.

This project builds on National Science Foundation (NSF) research that developed a method for estimating freight flows using automated computational workflows, a computer science technique for automatically querying data and conducting computations to produce freight flows from multiple data sources. The NSF-funded work resulted in a working model, "Argos," that was applied using data for the Los Angeles region (Ambite and Kapoor, 2007a; Giuliano, Gordon, Pan, Park and Wang, 2008). In contrast to traditional methods of preparing model input data, Argos automatically processes data from many different sources, greatly reducing the time and effort required for data preparation.

This research links the Argos workflow with TransCAD, a widely used commercial software package for metropolitan transportation planning. Linking with TransCAD allows practitioners easy access to Argos. The research also includes updating the data sources for the Argos model, generating an updated baseline, and using the updated modeling system for policy analysis.

This chapter provides a short overview of metropolitan freight modeling, a description of the Argos workflow, and summary results from the first application. Chapter 2 describes the process of linking Argos with TransCAD and validates the earlier results. Chapter 3 describes data sources and the updating process, and provides the updated baseline modeling results. Chapter 4 presents the policy analysis. Chapter 5 presents conclusions on the applicability of the modeling system for regional transportation planning.

1.2 Overview of Metropolitan Freight Modeling

A recent comprehensive review of freight modeling (Southworth, 2011) categorizes models into two main types: aggregate and disaggregate. Aggregate models estimate flows between geographic units; disaggregate models start with individual decision-makers and model the choices of mode, route, etc. Within the aggregate category, models are either vehicle trip based or economic activity based. Data availability continues to be a major constraint to model development, as most freight data are proprietary and costly to collect.

Trip based models are the most widely used for metropolitan freight planning. They are a logical progression from the traditional 4-step urban transportation planning model. However, these models have some notable disadvantages: 1) they implicitly assume that the vehicle trip is the unit of demand, rather than the commodity being transported (Holguin-Veras and Zorrilla, 2006), and hence do not model the underlying economic supply and demand; 2) they require extensive, highly detailed, place-specific data; 3) because the underlying economic dynamics are not considered, they require extensive calibration and are not transferable. Economic activity based commodity flow models are generally not used for metropolitan freight modeling because of the lack of available commodity flow data for small spatial units.

In earlier work we described an adequate urban transportation model as having the following attributes: (a) solid behavioral foundations; (b) multi-modal coverage; (c) the ability to analyze interactions between passenger and freight flows; (d) the ability to incorporate feedback from policy changes; (e) sufficient detail to capture small-area impacts; and (f) reliance on widely available (non-proprietary) and frequently updated data. Building on initial work by Gordon and Pan (2001), Giuliano et al (2008) developed an economic activity approach that minimizes reliance on proprietary or individualized survey data. The base is an economic input-output model, and freight flows are generated from economic supply and demand. The major research steps are the following:

  1. Estimate commodity-specific interregional and international trip attractions and trip productions for those locations where airports, seaports, rail yards or regional highway entry-exit points are located.
  2. Utilize a regional input-output transactions table to estimate intraregional commodity-specific trip attractions and trip productions at the level of small-area units.
  3. Create regional commodity-specific origin-destination matrices using estimates from steps (1) and (2).
  4. Load the O-D matrices onto a regional highway network with known passenger flows.
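The four steps above can be sketched as a toy pipeline. This is purely an illustration of the structure of the approach; all function names, data structures, and numbers are hypothetical, not the actual model code.

```python
# Toy sketch of the four estimation steps; all names and data are illustrative.

def step1_gateway_flows():
    # Step 1: interregional productions/attractions (tons) at entry-exit
    # nodes such as ports, airports, rail yards, and highway gateways.
    return {"port": {"prod": 100.0, "attr": 40.0}}

def step2_intraregional_flows():
    # Step 2: small-area productions/attractions derived from a regional
    # input-output transactions table and small-area employment data.
    return {"zone_a": {"prod": 30.0, "attr": 50.0},
            "zone_b": {"prod": 20.0, "attr": 60.0}}

def step3_od_matrix(prods, attrs):
    # Step 3: build an O-D matrix by distributing each origin's production
    # in proportion to destination attractions.
    total_attr = sum(attrs.values())
    return {(o, d): p * a / total_attr
            for o, p in prods.items() for d, a in attrs.items()}

def step4_assign(od, network):
    # Step 4: load O-D flows onto the highway network (placeholder routing:
    # each O-D pair uses a single known path of links).
    loads = {}
    for (o, d), tons in od.items():
        for link in network[(o, d)]:
            loads[link] = loads.get(link, 0.0) + tons
    return loads
```

Note that in the real model each step is commodity-specific and the assignment step works with passenger car equivalents on a congested network, not raw tonnage.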

We applied our model using 2001 data for the Los Angeles region, and our results are quite comparable to methods that rely on far more detailed data and model calibration (Giuliano et al, 2008). Because our approach uses widely available data sources and is economic activity based, it is transferable to other metropolitan areas. Variations of this model have been applied in research on Houston and Seattle by our research team. It is also scalable to higher levels, for example to statewide or region-wide planning and analysis.

1.3 The Argos Workflow

Commodity based models focus on supply and demand among economic sectors. The basic idea is to estimate inbound and outbound flows and then disaggregate the flows to an appropriate level of geography. Input/output tables are typically used to estimate the quantities of commodity supply and demand by geographic unit. Once allocated to origins and destinations, commodity flows are converted to truck trips. A truck origin-destination (O-D) matrix is then generated via some type of spatial interaction model (Holguin-Veras et al, 2001; Southworth, 2011). The focus on supply and demand helps to capture truck flows more accurately, and thus leads to more robust models (Wisetjindawat et al, 2006).
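The spatial interaction step referenced above can be sketched as a production-constrained gravity model with exponential distance decay. This is a generic illustration of the technique, not the specific formulation used in the cited studies; all names and parameter values are hypothetical.

```python
import math

def gravity_od(productions, attractions, distance, beta=0.1):
    """Production-constrained gravity sketch: the flow from origin o to
    destination d is proportional to d's attractions, discounted by an
    exponential distance-decay function exp(-beta * distance)."""
    od = {}
    for o, p in productions.items():
        weights = {d: a * math.exp(-beta * distance[(o, d)])
                   for d, a in attractions.items()}
        total = sum(weights.values())
        for d, w in weights.items():
            od[(o, d)] = p * w / total   # each origin's production is conserved
    return od
```

Larger values of `beta` concentrate flows on nearby destinations; `beta = 0` reduces the model to simple proportional allocation by attractions.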

As noted above, data availability is a major problem in metropolitan freight flow modeling. Ideally, such modeling requires data on commodity flows by industry sector, mode, origin and destination, all at a geographic scale sufficiently fine to identify specific flows on specific routes at specific times. Data are also needed on exports and imports to the region, as well as through traffic. Such a comprehensive data source does not exist. Analysts must either collect the necessary data directly, or develop a method based on available data. Because of the costs of direct data collection, and the rapidly changing nature of economic activity, our approach is based on using available data sources.

However, using existing data sources generates other challenges. Some data are not available at the appropriate geographic scale, there are more data on some flows than others, and industry data use various classification systems, making different data sources incompatible. Drawing on many different sources creates the need for an efficient way to combine and manipulate the data.

1.3.1 General Approach

Our approach is illustrated in Figure 1-1, using the Los Angeles region as a case study (see Giuliano et al, 2008 for a detailed description). The first row of boxes in the flow chart shows the various data sources used; these are described in Table 1-1. For interregional flows (imports/exports in Figure 1-1), we use a series of data sources to generate trip attractions and productions for the major import/export nodes. Regional input/output data and small area employment data are the basis for generating intraregional trip attractions and productions. Commodity attractions and productions are the basis for generating a freight flow O-D matrix. Control totals are used as checks at various points in the process.

Figure 1-1: Overview of Argos Data Process Steps

Figure 1-1 presents an overview of the original Argos workflow.

Our approach has some important advantages. First, the secondary data sources we use are widely available and regularly updated, hence the approach is easily transferable across metropolitan areas. Second, our approach avoids use of proprietary data (with two low-cost exceptions) and data obtainable only through metropolitan level surveys. Therefore data costs are low, relative to other more conventional approaches.

Table 1-1: Data Sources for the Los Angeles Application

Commodity Flow Survey (CFS). Code system: SCTG (Standard Classification of Transported Goods). Provides commodity flows by 2-digit SCTG sector for US regions, states, and MSAs. Flows in dollars, tonnage by mode. Level of detail varies by geographic unit. Based on a sample of shipments; sample data available by 5-digit SCTG, zipcode origin and destination, tonnage, value, mode. The CFS is conducted irregularly. Source: Bureau of Transportation Statistics.

IMPLAN. Code system: IMPLAN sectors. Provides county level input/output data by 509 IMPLAN sectors for US counties, county level inbound/outbound flows, and state and national foreign imports/exports. Proprietary data source, updated annually. Source: Minnesota IMPLAN Group, Inc.

WISERTrade. Code systems: HS (Harmonized System) and SITC (Standard International Trade Classification). Provides monthly imports and exports by HS code for customs districts, by mode; also provides annual imports and exports by SITC for world ports. Proprietary data source, updated monthly/annually. Source: WISERTrade.

Waterborne Commerce of the US (WCUS). Code system: WCUS sectors. Provides annual foreign and domestic trade by WCUS sector for major US ports, in tonnage. Updated annually.

Small area employment data. Code system: SIC (Standard Industrial Classification). The Southern California Association of Governments provides small area employment data generated from state employment and tax records; the data are used in regional modeling and forecasting. Source: Available by special request from the Southern California Association of Governments.

1.3.2 Computational Workflow: Argos Planner

Using widely available secondary data sources requires using multiple sources and developing methods to assure consistency across the sources. New computer science tools make it possible to generate the required computations efficiently. From the computer science perspective, the process of estimating freight flows is an instance of a scientific workflow (Taylor et al. 2006). A scientific workflow is an executable specification that describes a combination of data sources and algorithms that computes a desired dataset, such as the workflow of Figure 1-1.

There are many challenges in producing a data processing workflow such as the transportation model of Figure 1-1. Since the data come from a variety of sources, they may be expressed in different schemas, formats, and units. Therefore, the workflow needs to include many operations that perform different types of data conversion, for example, to translate a given measurement into different units: from tons to dollars to jobs to ton-miles to container units to trucks to passenger-car equivalents. Also frequent is the need to translate economic data described in one industry/sector classification to another, for example, from the North American Industry Classification System (NAICS) to the Standard Classification of Transported Goods (SCTG), or between different versions of these classifications, for example, from NAICS 1997 to NAICS 2002.
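Classification translations of this kind are typically driven by a concordance (crosswalk) table that splits each source code across one or more target codes. A minimal sketch follows; the codes and shares are hypothetical, and real concordances are far larger.

```python
def convert_classification(values, crosswalk):
    """Map values keyed by one code system to another using a concordance
    table of (source_code, target_code, share) rows. The shares for each
    source code should sum to 1.0 so that totals are preserved."""
    out = {}
    for src, tgt, share in crosswalk:
        if src in values:
            out[tgt] = out.get(tgt, 0.0) + values[src] * share
    return out
```

The same pattern covers unit conversions (tons to trucks, trucks to passenger-car equivalents) when the "shares" are replaced by conversion factors.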

There are many details that the abstract workflow of Figure 1-1 does not show. The detailed workflow that estimates the truck traffic due to freight movements contains over 50 data access and data processing operations. Gordon and Pan (2001) implemented this estimation model by a combination of manual steps and custom-designed programs. Argos automatically generates such a data processing workflow in response to a user data request, including all the necessary data integration and translation operations.

Argos is a general approach to construct data processing workflows, where the data sources and data processing operations are represented as web services. These services consume and produce relational tables, and thus are able to represent general computations. We describe the input/output signature of each service as relational formulas in an expressive logic (PowerLoom) using terms from an ontology of the application domain. These logical descriptions allow for a precise understanding of the data and enable the Argos planner to automatically construct a computational workflow in response to a user data request.

The Argos planner not only selects the relevant sources and data processing operations, but can also automatically insert adaptor services to connect the input and output of existing services. We have developed a set of domain-independent adaptor services that correspond to relational algebra operations (selection, projection, join and union), as well as some domain-dependent ones, such as product classification conversions. Figure 1-2 illustrates the insertion of an adaptor service. Assume that service Sc requires as input employment data according to the NAICS industry classification, but there is no source that produces such data. However, the system knows of a source Sp that contains a conversion table from SIC to NAICS industry codes and of a source Se for employment data that uses the SIC classification. Then, the system will automatically insert a Product Conversion service that adapts the data produced by Se to the data required by Sc, as shown in Figure 1-2. A more detailed description of the techniques for automatic workflow generation in Argos is available in (Ambite and Kapoor 2007a; Ambite and Kapoor 2007b).
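The adaptor-insertion idea can be sketched in miniature. This is a highly simplified illustration: the real Argos planner reasons over logical service descriptions in PowerLoom, not over simple schema labels, and all names below are hypothetical.

```python
def plan_with_adaptors(producer_schema, consumer_schema, converters):
    """If a producer's output schema matches the consumer's input schema,
    chain them directly; otherwise look up a converter that bridges the
    two schemas and splice it into the pipeline, mirroring the Product
    Conversion example of Figure 1-2."""
    if producer_schema == consumer_schema:
        return ["producer", "consumer"]
    key = (producer_schema, consumer_schema)
    if key not in converters:
        raise ValueError("no adaptor available for %s -> %s" % key)
    return ["producer", converters[key], "consumer"]
```

In the Figure 1-2 scenario, the employment source Se plays the producer role with schema "SIC", the consuming service Sc requires "NAICS", and the conversion table from Sp backs the inserted adaptor.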

Figure 1-2: Example of Adaptor Service

Figure 1-2 shows one of the automatically generated adaptor services that the Argos planner introduces when composing the workflow to respond to a user data request. The figure shows a Product Conversion service from SIC to NAICS industry codes, applied to employment data.

1.3.3 From Argos Planner to Transportation Model

The output of the Argos planner includes the productions and attractions, in dollars and tons, for both intraregional and interregional flows. In the previous application, the translation of tons to passenger car equivalents (PCEs), the distribution of PCEs to produce the O-D matrix, and the assignment of trips to the highway network were conducted using separate models developed by Prof. Qisheng Pan (see Gordon and Pan, 2001; Giuliano et al, 2008 for details). The process is shown in Figure 1-3 below.

We start with the matrix of productions and attractions by commodity code, by ton. We assume that all intraregional trips are by truck. Interregional trips include imports to the region, exports from the region, and through traffic. We use various data sources to factor out the portions of interregional flows that move by water, air or rail. For imports and exports, each flow (trip) starts/ends at an external node (a port, railroad yard, airport, interstate highway) and ends/starts at an internal zone. These local collection or distribution segments are also assumed to be truck. Port imports provide an example of how the factoring works. About 40% of all imports (as measured in dollars) are consumed within the region, and therefore travel on truck from the ports to distribution/warehousing locations. Of the 60% of imports destined for locations outside the region, about 7% travels entirely by rail or rail/truck combination, 6% by water or water/truck combination, 20% by air or air/truck combination and the remainder travel by truck only[1].
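The port-import factoring described above works out as follows. The numbers are taken directly from the shares in the text; only the variable names are our own.

```python
# Shares from the port-import example (measured in dollars).
local_share = 0.40             # consumed within the region; moves by truck
outbound_share = 0.60          # destined for locations outside the region

# Shares OF THE OUTBOUND PORTION that move entirely by rail (or rail/truck),
# water (or water/truck), and air (or air/truck) combinations.
rail, water, air = 0.07, 0.06, 0.20

# Remainder of the outbound portion that travels by truck only.
truck_only_outbound = outbound_share * (1.0 - rail - water - air)
```

So roughly 40 percent of import dollars generate local truck distribution trips, and about another 40 percent (0.60 x 0.67) leave the region by truck only.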

In the earlier application, we conducted a two-stage trip distribution. The first stage distributed intraregional trips. In the second stage, the intraregional distribution was used to allocate the interregional trips to traffic analysis zones (TAZs); that is, interregional trips were assigned in proportion to intraregional trip attractions.

The conversion from tons to PCEs can be done in a number of ways. In this case we start with a control total of daily truck trips. Using state level factors, we calculate shares of truck trips/day by commodity sector, which yields tons/truck by sector. This method implicitly accounts for empty truck trips. The conversion to PCEs is based on vehicle classification data.
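The control-total conversion described here can be sketched as follows. This is a minimal illustration of the logic, not the study's actual implementation; function and parameter names are hypothetical.

```python
def tons_per_truck_by_sector(total_tons_by_sector, total_daily_trucks,
                             truck_share_by_sector):
    """Derive tons/truck by commodity sector from a control total of daily
    truck trips and state-level shares of trucks by sector. Because the
    control total counts all truck trips, empty trips are implicitly
    accounted for, as noted in the text."""
    return {s: tons / (total_daily_trucks * truck_share_by_sector[s])
            for s, tons in total_tons_by_sector.items()}

def od_tons_to_pces(od_tons, sector, tons_per_truck, pce_per_truck):
    """Convert one O-D tonnage entry to truck trips, then to passenger
    car equivalents (PCEs) using a vehicle-classification-based factor."""
    return od_tons / tons_per_truck[sector] * pce_per_truck
```

With illustrative inputs, a sector carrying 1,000 tons with half of 100 daily trucks yields 20 tons/truck, so a 40-ton O-D cell becomes 2 truck trips before the PCE factor is applied.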

The final step in the process is the traffic assignment. In our earlier work, we started with the equilibrium assignment of all passenger trips (data provided by the Southern California Association of Governments). We then assigned the truck PCEs to the equilibrium network.

Figure 1-3: Original Flowchart for Traffic Assignment

This figure shows a flowchart of the steps involved in generating the traffic assignment. The traffic assignment process is fully described in section 1.3.3 of the text.

1.4 Test Empirical Application

The first empirical application was based on 2001 data for the Los Angeles region. We were able to compare our results with heavy duty truck (HDT) screenline data for 2003 provided by SCAG. We generated two traffic assignments, one with PCEs based on proportion of trucks by number of axles in the region; the other with PCEs unique to each screenline as calculated by SCAG. Both were compared with actual screenline data. Figure 1-4 shows our results using the proportion based PCEs. The average difference is 36%, the weighted average difference is 20%, and the regression R2 is 0.80. These results are at least as good as SCAG's model results, and we did not use any data fitting techniques to adjust our model.
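Validation statistics of the kind reported above can be computed as sketched below. The definitions are illustrative; in particular, the study's R2 comes from a regression of estimated against observed screenline volumes, which this simple goodness-of-fit measure only approximates.

```python
def screenline_fit(estimated, observed):
    """Simple screenline validation statistics: mean absolute percentage
    difference, a volume-weighted version, and an R^2-style measure of
    estimated vs. observed volumes."""
    n = len(observed)
    diffs = [abs(e - o) / o for e, o in zip(estimated, observed)]
    avg_diff = sum(diffs) / n
    wavg_diff = (sum(abs(e - o) for e, o in zip(estimated, observed))
                 / sum(observed))
    mean_o = sum(observed) / n
    ss_res = sum((o - e) ** 2 for e, o in zip(estimated, observed))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return avg_diff, wavg_diff, r2
```

The weighted average difference is always pulled toward the high-volume screenlines, which is why it (20%) is lower than the unweighted average (36%) in the results above.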

The next step in improving our model is to link the Argos planner to commercially available transportation planning software. We selected TransCAD for this purpose. TransCAD is a GIS based software package with full transportation modeling capabilities. It can be used at varying levels of geography, and it has an easy to use graphical interface. In addition, it is capable of handling very large data sets, a requirement for application to the Los Angeles region. The following chapter describes the process of linking Argos to TransCAD.

Figure 1-4: Comparison of Results

This figure shows the plot of estimated (y axis) against actual (x axis) heavy-duty truck trips in PCEs. It shows the regression line and gives the coefficient estimates: estimated HDT PCE = 0.9851(actual HDT PCE) + 27096.

Updated: 05/24/2013
Federal Highway Administration | 1200 New Jersey Avenue, SE | Washington, DC 20590 | 202-366-4000