This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-98-166
Date: July 1999

Guidebook on Methods to Estimate Non-Motorized Travel: Supporting Documentation

2.5 Discrete Choice Models

Demand Estimation

Descriptive Criteria: What is It?

Categories:

[x] Bicycle   [x] Pedestrian   [x] Facility-Level   [x] Area-Level

Purpose:

A discrete choice model predicts a decision made by an individual (choice of mode, choice of route, etc.) as a function of any number of variables, including factors that describe a bicycle or pedestrian facility improvement or policy change. The model can be used to estimate the total number of people who change their behavior in response to an action. As a result, the change in both non-motorized and motorized trips and distance of travel can be estimated. The model can also be used to derive elasticities, i.e., the percent change in bicycle or pedestrian travel in response to a given change in any particular variable.

Structure:

For a general discussion of discrete choice modeling principles and methods, see Ben-Akiva and Lerman (1985) and Horowitz, Koppelman, and Lerman (1986). Key points are summarized here.

A discrete choice model is a mathematical function which predicts an individual's choice based on the utility or relative attractiveness of competing alternatives (for example, bike or drive). The logit function is a common mathematical form used in discrete choice modeling(1).
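As a sketch of the two-alternative case mentioned above (bike or drive), the logit function can be written as follows. The symbols V_bike and V_car, standing for the utility of each alternative, are notation assumed here for illustration and do not appear in the original report.

```latex
P(\text{bike})
  = \frac{e^{V_{\text{bike}}}}{e^{V_{\text{bike}}} + e^{V_{\text{car}}}}
  = \frac{1}{1 + e^{-(V_{\text{bike}} - V_{\text{car}})}}
```

Only the difference in utilities matters: when the two utilities are equal the probability is 0.5, and it approaches 1 (or 0) as bicycling becomes much more (or less) attractive than driving.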

The model generally includes characteristics of the individual (e.g., age, gender, and income) and relative attributes of competing choices (e.g., cost and time of auto vs. bike travel). It also might include environmental factors, personal attitudes, or other factors which are thought to influence the choice in question. The model is developed from a data set containing individual trip decisions, characteristics of alternative choices for the trip, geographical characteristics, and characteristics of the individual.

A simple discrete choice model, for example, might be used to predict the probability of taking a trip by bicycle vs. by car, based on three factors:

1. Time difference between the two modes for the trip.

2. Whether the respondent is male or female.

3. Whether or not bicycle lanes are available.
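A minimal sketch of this three-factor example is shown here, assuming a binary logit form. The coefficient values and the function name `bike_probability` are hypothetical, chosen only for illustration; they are not estimates from any study cited in this section.

```python
import math

# Hypothetical coefficients for illustration only (not estimated from data).
BETA_CONST = -1.5       # baseline preference for bicycle relative to car
BETA_TIME_DIFF = -0.08  # per minute that the bicycle trip is longer than the car trip
BETA_MALE = 0.4         # applies when the respondent is male (1) rather than female (0)
BETA_BIKE_LANE = 0.7    # applies when bicycle lanes are available (1) rather than not (0)


def bike_probability(time_diff_min, is_male, has_bike_lane):
    """Return the binary logit probability of choosing bicycle over car."""
    utility_difference = (
        BETA_CONST
        + BETA_TIME_DIFF * time_diff_min
        + BETA_MALE * is_male
        + BETA_BIKE_LANE * has_bike_lane
    )
    return 1.0 / (1.0 + math.exp(-utility_difference))


# Compare the same traveler and trip with and without bicycle lanes.
without_lanes = bike_probability(time_diff_min=10, is_male=1, has_bike_lane=0)
with_lanes = bike_probability(time_diff_min=10, is_male=1, has_bike_lane=1)
print(f"P(bike) without lanes: {without_lanes:.3f}")
print(f"P(bike) with lanes:    {with_lanes:.3f}")
```

With a positive bike-lane coefficient, the predicted probability of bicycling is higher when lanes are available; the size of the difference depends entirely on the assumed coefficients.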

The estimated coefficients (or weights for each factor) can be used to derive elasticities. Elasticities indicate the percent change in the variable being predicted (i.e., probability of choosing a mode) for a given change in one of the independent variables, holding the other variables constant. While transferring elasticities to real-world situations involves a number of assumptions, the elasticities may be used to estimate the change in users as a function of a given change in a facility or policy variable.
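The following sketch shows one way an elasticity could be derived from coefficients, assuming a binary logit model. For a continuous variable x with coefficient beta, the standard point elasticity of the choice probability P is beta * x * (1 - P). The coefficients and function names are hypothetical, and gender is omitted to keep the example short.

```python
import math

# Hypothetical coefficients for illustration only.
B_CONST, B_TIME, B_LANE = -1.5, -0.08, 0.7


def bike_probability(time_diff_min, has_bike_lane):
    """Binary logit probability of choosing bicycle over car."""
    v = B_CONST + B_TIME * time_diff_min + B_LANE * has_bike_lane
    return 1.0 / (1.0 + math.exp(-v))


def time_elasticity(time_diff_min, has_bike_lane):
    """Analytic point elasticity of P(bike) with respect to time difference."""
    p = bike_probability(time_diff_min, has_bike_lane)
    return B_TIME * time_diff_min * (1.0 - p)


def numeric_time_elasticity(time_diff_min, has_bike_lane, step=1e-4):
    """Finite-difference check: percent change in P per percent change in time."""
    p0 = bike_probability(time_diff_min, has_bike_lane)
    p1 = bike_probability(time_diff_min * (1.0 + step), has_bike_lane)
    return ((p1 - p0) / p0) / step


def bike_lane_effect_pct(time_diff_min):
    """Percent change in P(bike) when a yes/no variable (bike lanes) switches on."""
    p_without = bike_probability(time_diff_min, 0)
    p_with = bike_probability(time_diff_min, 1)
    return 100.0 * (p_with - p_without) / p_without


print(time_elasticity(10, 0), numeric_time_elasticity(10, 0))
print(bike_lane_effect_pct(10))
```

The elasticity depends on the starting values of the variables, which is one reason transferring a single elasticity to another setting involves assumptions. For a yes/no variable such as bike lanes, comparing the two probabilities directly is usually more meaningful than a point elasticity.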

Ideally, instead of simply using elasticities, the model is applied to the entire affected population to estimate the total number of people who will change their behavior as a result of an improvement. To do this, an affected population must be defined. Examples of such a population might be residents in a census tract or transit users who access a particular transit station. The population must be defined in groups for which either an average value or a distribution is known for every variable in the model.

There are three alternative methods for aggregating results for the population (Horowitz, Koppelman, and Lerman, 1986):

1. The "naive" method. Average values are assumed for each variable except the one of interest. In the current example, an average trip time difference and an average gender value (such as 50 percent male/50 percent female) are used, and the probability of choosing the bicycle mode is compared with and without bike lanes. Significant errors may be introduced, however, by using single aggregate values for population variables.

2. The "market segmentation" method. The population is divided into groups (e.g., by gender and by travel distance). For each group, a mode choice probability is estimated and multiplied by the population of that group, and the results are summed across all groups. This is repeated with and without bicycle lanes, and the total numbers of bicycle trips for the two alternatives are then compared. This reduces, but does not eliminate, aggregation errors. The method is widely used in practice (see Wilbur Smith Associates, 1996).

3. The "sample enumeration" method. This method takes a random sample of the total population, estimates a mode choice probability for each person in the sample, and averages the sample probabilities to estimate a mode share for the entire population. This method is the most accurate of the three but is also the most difficult to apply.
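The three aggregation methods above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any cited study: the coefficients, the synthetic population, the 20-minute segment break, and all function names are hypothetical.

```python
import math
import random

# Hypothetical coefficients for illustration only (not estimated from any survey).
COEF = {"const": -1.5, "time_diff": -0.08, "male": 0.4, "bike_lane": 0.7}


def bike_probability(time_diff, male, bike_lane):
    """Binary logit probability of choosing bicycle over car."""
    v = (COEF["const"] + COEF["time_diff"] * time_diff
         + COEF["male"] * male + COEF["bike_lane"] * bike_lane)
    return 1.0 / (1.0 + math.exp(-v))


def make_population(n, seed=0):
    """Synthetic population: time difference in minutes and a 0/1 male flag."""
    rng = random.Random(seed)
    return [{"time_diff": rng.uniform(0.0, 40.0), "male": rng.randint(0, 1)}
            for _ in range(n)]


def naive(population, bike_lane):
    """Method 1: plug population averages into the model once."""
    n = len(population)
    mean_time = sum(p["time_diff"] for p in population) / n
    mean_male = sum(p["male"] for p in population) / n
    return n * bike_probability(mean_time, mean_male, bike_lane)


def market_segmentation(population, bike_lane, time_break=20.0):
    """Method 2: one probability per segment (gender by short/long trip)."""
    segments = {}
    for p in population:
        key = (p["male"], p["time_diff"] >= time_break)
        segments.setdefault(key, []).append(p)
    total = 0.0
    for (male, _), members in segments.items():
        seg_time = sum(p["time_diff"] for p in members) / len(members)
        total += len(members) * bike_probability(seg_time, male, bike_lane)
    return total


def sample_enumeration(population, bike_lane, sample_size, seed=0):
    """Method 3: average individual probabilities over a random sample."""
    sample = random.Random(seed).sample(population, sample_size)
    share = sum(bike_probability(p["time_diff"], p["male"], bike_lane)
                for p in sample) / sample_size
    return len(population) * share


# Estimated change in bicycle trips when bike lanes are added, by method.
population = make_population(2000)
for method in (naive, market_segmentation):
    print(method.__name__, method(population, 1) - method(population, 0))
print("sample_enumeration",
      sample_enumeration(population, 1, 200) - sample_enumeration(population, 0, 200))
```

Because the logit function is nonlinear, evaluating it at average values generally does not give the average of the individual probabilities. That gap is the aggregation error the first two methods introduce, and segmentation narrows it by averaging within more homogeneous groups.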

Calibration/Validation Approach:

Discrete choice models developed from stated-preference surveys can be calibrated/validated using models developed from data on actual (revealed) behavior (see "Inputs/Data Needs").

Inputs/Data Needs:

Discrete choice models are developed from data sets containing individual trip decisions, including characteristics of the individual and of alternative choices for the trip. Two types of data, revealed-preference and stated-preference, may be used, as described below:

1. "Revealed-preference" data, or data on actual behavior. This may be collected from a travel survey, which determines characteristics of a trip (origin and destination, mode, travel time, etc.) as well as characteristics of the individual and other influencing factors. Observations on trip decisions by 1,000 to 3,000 people are required (Horowitz et al., 1986).

For this type of data to be useful for predicting non-motorized travel, the data set must include the following:

  • Characteristics of the non-motorized mode alternative for each trip, such as time, cost, facility or environment factors of interest, etc., even if the trip was not taken by the non-motorized mode.

  • Enough observations of people taking non-motorized trips that this choice can be reasonably estimated from the other variables in the model.

While travel survey data are routinely collected in many metropolitan areas, at least one of these criteria is usually not met. Therefore, use of revealed-preference data to predict non-motorized mode choice generally requires additional data collection efforts. Potential sources of both existing data and new data are discussed in the following sections.

An additional limitation of revealed-preference data for forecasting bicycle or pedestrian travel is that they cannot predict the impact of non-motorized improvements that do not yet exist. For example, if an extensive network of bicycle paths is to be developed but bicycle paths do not yet exist in the area, no observations are available to predict the use of these facilities. In cases of hypothetical improvements, a second type of data must be collected:

2. "Stated-preference" data. To collect this type of data, respondents are asked to identify the choices they would make under various scenarios. For example, different combinations of the relative trip time, cost, and presence of bike lanes would be presented, and for each combination, the respondent would choose whether to drive or bicycle.

This method is capable of evaluating a wide range of factors, including ones that do not yet exist. However, it has at least four significant drawbacks:

  • First, respondents are frequently overly optimistic when responding to hypothetical questions (Hunt and Abraham, 1997). For example, asking people whether "they would bicycle, given factor X" will significantly overestimate the actual number of people who will switch to bicycling if factor X is provided. This problem can largely be overcome by designing survey questions that force people to make tradeoffs between attributes, and by comparing their stated responses with their actual behavior when facing similar tradeoffs.

  • Second, respondents must imagine what their choices would be like rather than experiencing them directly, and they may not accurately be able to judge their response to a situation that they have not encountered. This is a particular problem for evaluating bicycle and pedestrian facilities for which qualities of the physical environment (pavement smoothness, traffic noise, etc.) may be significant factors. Visual simulation techniques such as those used by Wilbur Smith Associates (1996) can partially although not completely overcome this drawback.

  • Third, the range of factors to be evaluated must be kept simple and phrased in terms that people can conceptualize. For example, people may be able to predict their choice to walk given the "presence or absence of a sidewalk," but not as a specific function of sidewalk design, street crossing types, and other factors that make up the pedestrian environment.

  • Finally, respondents may say what they think the interviewer wants to hear rather than expressing their true opinion. This problem may vary depending on the methodology used to implement the survey and the ways in which the survey questions are phrased.

Stated-preference surveys are discussed further under the entry on Preference Surveys.
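As an illustration of the stated-preference scenarios described above (combinations of relative trip time, cost, and presence of bike lanes), the sketch below enumerates every combination of a few attribute levels. The levels, units, and names are hypothetical. Real surveys often present only a subset of combinations to each respondent to keep the task manageable.

```python
from itertools import product

# Hypothetical attribute levels for illustration only.
TIME_DIFF_MIN = [0, 10, 20]      # extra minutes by bicycle compared with driving
COST_DIFF_USD = [0.0, 2.0, 4.0]  # extra cost of driving compared with bicycling
BIKE_LANE = [False, True]        # whether bike lanes are present on the route


def build_scenarios():
    """Return every combination of attribute levels as a list of dictionaries."""
    return [
        {"time_diff_min": t, "cost_diff_usd": c, "bike_lane": lane}
        for t, c, lane in product(TIME_DIFF_MIN, COST_DIFF_USD, BIKE_LANE)
    ]


# Each scenario would be shown to a respondent, who chooses to drive or bicycle.
for number, scenario in enumerate(build_scenarios(), start=1):
    print(number, scenario)
```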

Photo of a person surveying 2 bicyclists on the side of the road

Figure 2.5 In a stated-preference survey, respondents are asked to choose between alternatives with different attributes.

Potential Data Sources:

Mode choice models including bicycling and/or walking are usually developed directly from the results of special data collection efforts. The most common approach is to conduct a stated-preference survey of users and potential users. Respondents are asked to choose between alternatives with different attributes. The results of these choices are then combined with information about the respondent, her/his current choice of the presently available alternatives, and other environmental factors, to develop a predictive model.
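The step of turning recorded choices into a predictive model can be sketched as below. This is an assumption-laden illustration, not the method of any cited study. It fits a binary logit by maximum likelihood, the standard estimation approach for logit models, using plain gradient ascent. The data are synthetic, and the variable scaling, learning rate, and function names are hypothetical. In practice, specialized estimation software would be used instead.

```python
import math
import random


def sigmoid(v):
    """Logit function mapping a utility difference to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


def simulate_choices(n, true_beta, seed=0):
    """Create synthetic (features, chose_bike) records; purely illustrative data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Features: constant, time difference in tens of minutes, bike lane (0/1).
        x = [1.0, rng.uniform(0.0, 4.0), float(rng.randint(0, 1))]
        p = sigmoid(sum(b * xi for b, xi in zip(true_beta, x)))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


def mean_log_likelihood(beta, rows):
    """Average log-likelihood of the observed choices under coefficients beta."""
    total = 0.0
    for x, y in rows:
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        total += math.log(p) if y else math.log(1.0 - p)
    return total / len(rows)


def fit_logit(rows, learning_rate=0.3, iterations=1500):
    """Estimate coefficients by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * len(rows[0][0])
    for _ in range(iterations):
        grad = [0.0] * len(beta)
        for x, y in rows:
            error = y - sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for k, xi in enumerate(x):
                grad[k] += error * xi
        beta = [b + learning_rate * g / len(rows) for b, g in zip(beta, grad)]
    return beta


rows = simulate_choices(800, true_beta=[-0.5, -0.8, 1.0])
estimate = fit_logit(rows)
print("estimated coefficients:", [round(b, 2) for b in estimate])
```

The estimated coefficients will differ somewhat from the values used to generate the data because the sample is finite; larger samples narrow that gap.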

Household travel surveys are a potential existing source of revealed-preference data. These surveys are conducted routinely by many Metropolitan Planning Organizations (MPOs), although not all have included non-motorized travel in the past. Metropolitan travel surveys, however, generally suffer from two major limitations for developing discrete choice models of non-motorized travel:

1. Characteristics of non-motorized alternatives for each trip are generally not collected, so the effects of changing policies or improving facilities cannot be evaluated.

2. Most surveys do not include enough observations of non-motorized trips to develop predictive models for these modes.

It is possible that the first limitation could be overcome by collecting additional data describing existing bicycle/pedestrian facilities or environments, so that these factors can then be related to the locations of survey respondents and used as a predictor of travel decisions. However, the effort involved in collecting this data could be considerable.

The Nationwide Personal Transportation Survey (NPTS) is another potential source of individual travel behavior data. However, collecting local facility or environmental data for a model based on this source would again be difficult, and the geographic detail in the survey is limited.

Surveys of transit access mode are sometimes conducted by transit agencies and have also been used as a data source for predicting access mode choice (Wilbur Smith Associates, 1996; Loutzenheiser, 1997). An advantage of these surveys is that they contain a significant percentage of non-motorized trips and generally distinguish between walk and bicycle access. Transit access surveys may need to be supplemented with additional site-specific data collection on bicycle and/or pedestrian facility factors to evaluate the effects of these factors.

Models can also be developed from special revealed-preference data collection efforts, which relate information from counts and/or surveys of users to descriptors of the facilities or travel environment encountered (for a discussion, see Hunt and Abraham, 1997).

Computational Requirements:

Discrete choice models can be estimated using a desktop microcomputer with specialized software such as ALOGIT.

User Skill/Knowledge:

A knowledge of statistical analysis and discrete choice modeling techniques is required, in addition to familiarity with sources and methods of collecting survey data.

Assumptions:

Discrete choice models assume that choices made by individuals can be predicted based on a limited set of quantifiable factors and that people are essentially rational decision-makers who seek to make choices that maximize their utility. Furthermore, the relationship between the underlying factors and the probability of the individual choosing a particular alternative is assumed to bear a particular functional form (e.g., a logit function).

Facility Design Factors:

A range of facility design factors can be included in a discrete choice model. The inclusion of design factors is generally limited by:

1. For stated-preference surveys, the need to keep hypothetical alternatives simple and understandable to the respondent.

2. For revealed-preference surveys, the resources required to collect data describing existing facilities. Also, design factors are limited to those that currently exist in the real world.

Output Types:

Possible outputs include:

  • The probability of an individual making a particular choice given particular levels of variables (such as availability of bicycle parking, presence of a sidewalk, etc.)
  • Elasticities indicating the percent change in the variable being predicted (i.e., probability of choosing a mode) for a given change in one of the independent variables, holding the other variables constant.
  • Total number/percent of people expected to change behavior, if results of the model are aggregated over a population.

Real-World Examples:

Work Trip Mode Choice:

Kocur, Hyman, and Aunet (1982) describe the development of work-trip mode choice models for the Wisconsin Department of Transportation (WisDOT). In the late 1970s and early 1980s, WisDOT developed a series of mode-choice models to consistently assess transportation policy issues across urban areas in the State. Work-trip logit mode choice models were developed for four sets of metropolitan areas in Wisconsin based on the results of stated and revealed-preference surveys. Bicycle and walk were included as separate mode choices. Bicycle facility variables included distance to work, existence of a bike lane (yes or no), street surface (smooth or rough), and traffic (busy or quiet). Pedestrian facility variables included distance to work, presence of sidewalks, and season (summer or winter). The models were used to estimate the effects of various policies on mode split. Addition of marked bicycle lanes to all streets in the cities studied was estimated to increase total summertime bicycle trips by 39 percent. Allowing pavement to deteriorate from smooth to rough was estimated to reduce summertime bicycle work trips by 42 percent.

Transit Access Mode:

Discrete choice models have been developed in a number of areas to predict transit access mode. A study for the Regional Transportation Authority in Chicago (Wilbur Smith Associates, 1996) estimates the effects on transit access mode choice of various improvements to bicycle and pedestrian facilities in station areas, based on estimation of a discrete mode choice model from both revealed-preference and stated-preference survey data. For more information, see separate entry on "Discrete Choice Models - Transit Access."

Taylor and Mahmassani (1996) developed a discrete choice model based on a hypothetical-choice stated-preference survey to assess preferences for work trip mode choice (auto, park-and-ride, or bike-and-ride). Facility factors include on-street bicycle facility type, bicycle parking facility type, and access distance to transit. Only relative utilities are reported, and the model is not used to predict changes in total mode use as a result of facility changes.

Loutzenheiser (1997) developed a discrete choice model of transit access mode choice based on Bay Area Rapid Transit (BART) passenger surveys and station area characteristics. Urban design and station area characteristics were found to be secondary to individual characteristics in determining the choice to walk. Station area variables included nearby arterials and freeways, grid pattern, population density, and type and mix of land uses; the descriptors were developed using GIS techniques.

Inclusion of Attitudinal and Perception Factors:

Additional studies have focused on determining the effect of users' attitudes and perceptions on the choice to walk or bike. These studies have also included rudimentary variables describing the quality of bicycle and/or pedestrian facilities.

Katz (1996) modeled demand for commuter bicycle use in two steps: (1) the choice to participate (bicycle) is modeled, through factor analysis and logit regression, based on attitudes and personal characteristics; and (2) mode choice is modeled through discrete choice (logit) models which include attitudes, personal characteristics, and structural factors (cost, distance, etc.). Bicycle facility measures include bicycle cost, trip distance, availability of showers and parking at the trip end, and percent of trip on a bike path. Elasticities for the bicycle mode are -0.88 for trip distance, +0.58 for percent of trip on bike path, and +0.26 for car cost. Inclusion of attitudinal factors is found to significantly improve model fit. Data are based on telephone and in-person surveys and choice experiments. An extensive discussion and literature review of the behavior modeling issues and techniques relevant to bicycle travel modeling is also included.
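As a worked illustration of how the elasticities reported above might be read, a first-order approximation multiplies the elasticity by the percent change in the variable. The 10 percent changes below are hypothetical inputs chosen for the arithmetic, and the approximation is reasonable only for small changes.

```latex
\%\Delta(\text{bicycle use}) \approx E_x \times \%\Delta x

\begin{aligned}
\text{trip distance } (E = -0.88):&\quad -0.88 \times 10\% = -8.8\% \\
\text{percent of trip on bike path } (E = +0.58):&\quad +0.58 \times 10\% = +5.8\% \\
\text{car cost } (E = +0.26):&\quad +0.26 \times 10\% = +2.6\%
\end{aligned}
```

Here a 10 percent change means a relative change in the variable itself (for example, the bike-path share of a trip rising from 50 to 55 percent), not a change of 10 percentage points.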

Kitamura, Mokhtarian, and Laidet (1997) conducted stated-preference surveys to determine the relative influence of socioeconomic, attitudinal, and neighborhood characteristics on travel behavior. Discrete choice models were developed to predict mode choice and total number of trips by mode. Facility variables included presence of sidewalks and bike paths as well as perceptions of whether streets are pleasant for walking or bicycling.

Noland and Kunreuther (1995) developed multinomial logit models which relate use of a mode to perceptions of risk and convenience of that mode (perceptions of cost, comfort, and relevant personal variables are also included). Risk and convenience perceptions were measured based on surveys of bicyclists and of the general population. Modes include auto, transit, bicycle, and walk. The model was used to evaluate the general effect of policy variables on mode split. Elasticities were developed with respect to bicycle convenience, comfort, parking availability, competency, and lack of shoulders, as well as auto cost, convenience, and comfort. Sample enumeration was used to predict future mode splits as a result of policy changes. "Short-run" and "long-run" elasticities and mode splits were developed, which assume that many people do not have a choice of modes in the short run, but that in the long run different urban form policies and residential location decisions could allow everyone a choice of modes.

Mode Choice in Travel Demand Models:

Discrete choice models have been widely used to predict mode choice for work trips and other types of trips in the development of regional travel models. However, these models rarely include bicycle or walking trips as separate modes. For a discussion of models that do include bicycle and walking trips, see entry for "Regional Travel Models."

Route Choice Models:

Discrete choice models have also been applied to predicting route choice or facility preference as a function of route/facility characteristics (see "Discrete Choice Models - Route Choice," Method 2.6).

Publications:

Ben-Akiva, M. and S.R. Lerman. Discrete Choice Analysis: Theory and Application to Travel Demand. Cambridge, MA: The MIT Press, 1985.

Horowitz, Joel L.; Frank S. Koppelman and Steven R. Lerman. A Self-instructing Course in Disaggregate Mode Choice Modeling. Prepared for the Urban Mass Transit Administration (now Federal Transit Administration), Washington, DC, December 1986.

Katz, Rod. Demand for Bicycle Use: A Behavioural Framework and Empirical Analysis for Urban NSW. Doctoral Thesis, The Graduate School of Business, The University of Sydney, Sydney, NSW, Australia, December 1996.

Kitamura, Ryuichi; Patricia L. Mokhtarian and Laura Laidet. A Micro-Analysis of Land Use and Travel in Five Neighborhoods in the San Francisco Bay Area. Transportation, Vol. 24, No. 2, May 1997.

Kocur, George; William Hyman and Bruce Aunet. Wisconsin Work Mode-Choice Models Based on Functional Measurement and Disaggregate Behavioral Data. Transportation Research Record 895, 1982.

Loutzenheiser, David R. Pedestrian Access to Transit: A Model of Walk Trips and their Design and Urban Form Determinants Around BART Stations. Transportation Research Board, 76th Annual Meeting, Washington, DC, January 1997.

Noland, Robert B. and Howard Kunreuther. Short-Run and Long-Run Policies for Increasing Bicycle Transportation for Daily Commuter Trips. Transport Policy, Vol. 2, No. 1, 1995.

Taylor, Dean and Hani Mahmassani. Analysis of Stated-Preferences for Intermodal Bicycle-Transit Facilities. Transportation Research Record No. 1556, 1996.

Wilbur Smith Associates. Non-Motorized Access to Transit: Final Report. Prepared for Regional Transportation Authority, Chicago, IL, July 1996.

Evaluative Criteria: How Does It Work?

Performance:

Kocur, Hyman, and Aunet (1982), in calibrating their behavior models based on actual behavior, found that the calibration coefficients are "larger than we would ideally like to see, but they indicate a relatively good correspondence between the experimental models and actual behavior."

The performance of the other models discussed here has not been evaluated.

Use of Existing Resources:

Development of a discrete choice model usually requires new data collection efforts. In some cases, it may be possible to transfer coefficients from a model developed in one area to other areas, eliminating the need for local data collection. However, this implies that the two situations are similar with respect to factors not included in the model.

Travel Demand Model Integration:

Discrete choice models are widely used to predict mode choice in existing travel demand models. It is a logical extension of existing practices to include non-motorized travel in this step. The added complication and data requirements, however, have so far limited the inclusion of non-motorized travel in most models.

Applicability to Diverse Conditions:

Determining the variables to include in a model and the required data collection efforts represents a tradeoff. The more specific the variables to the improvement being analyzed, the more accurate the results in analyzing that improvement. On the other hand, the model will be less applicable in different situations, and if a different improvement is to be analyzed, new data collection and modeling efforts may be required. Models with general environment or facility descriptors may have broader applicability but will be less suited for analyzing the impacts of a particular improvement. As an example, a model of bicycle choice may be estimated regionally using a variable of "miles of bicycle lanes available." Such a model may be of general use for evaluating and comparing facility improvement policies. For evaluating the effects of a specific improvement, however, it may not be as accurate as a model based on a survey in one locality which includes as a variable "bike lanes from point A to point B on street X."

Kocur, Hyman, and Aunet compared coefficients among the four sets of cities of varying sizes for which they were developed. They found that "most of the coefficients show relatively little variation across cities, which suggests that transferability of these coefficients among urban areas is a possibility."

Usage in Decision-Making:

No information is available.

Ability to Incorporate Changes:

See "Applicability to Diverse Conditions" above.

Ease-of-Use:

A knowledge of discrete choice modeling techniques is required, in addition to familiarity with sources and methods of collecting survey data.

(1) The logit function is an "S-shaped" function relating one or more independent variables, such as the difference in auto and bicycle travel times, to the probability of making a specific choice, such as choosing to bicycle.

 
