Office of Planning, Environment, & Realty (HEP)
Penelope Weinberger, AASHTO, Pweinberger@aashto.org
CTPP Oversight Board Meeting
On April 7th 2009, the CTPP Oversight Board met by conference call. Subjects included 1) the CTPP Work Program, 2) the NCHRP project, "Identifying Credible Alternatives for Producing 5-year CTPP Data Products from the ACS," 3) CTPP 3-year data product, and 4) Census Bureau Federal Register notice on ACS 5-year data products. Additional details on items 3 and 4 follow. The Board is scheduled to meet again in August 2009.
Request for CTPP 3-year ACS data products.
As covered in the December 2008 issue of the CTPP Status Report (http://www.fhwa.dot.gov/planning/census_issues/ctpp/status_report/), the Census Bureau upheld the Disclosure Review Board (DRB) ruling on thresholds for the CTPP 3-year data products (submitted to the Census Bureau in February 2008). A task force was reconvened to revise the CTPP request given the DRB rules. Members included state DOT representatives from New York, California, Florida, Georgia, Virginia and Minnesota and MPO representatives from Atlanta, Washington, DC, and Hampton Roads, Virginia. Steve Polzin and Alan Pisarski also provided valuable input.
Task force members were unanimous that there is a continuing need for 3-year CTPP special tabulations and that 3-year county-to-county flow data are priorities. The task force together with Melissa Chiu of the Census Bureau developed a revised table list to meet the DRB requirements. Our new request, submitted to the Census Bureau on April 1st, 2009, limits the cross-tabulations with Means of Transportation to five variables deemed to be most important for planning purposes. They are:
We also asked for 'length of time in US' to be used as an alternate variable, should one of the other variables fail. The DRB was firm about limiting cross-tabs to five variables. In addition to paring our request down from 18 to five variables, we included a table collapsing schema that was calculated to give us data for a majority of geographies. We are currently awaiting a decision from DRB, a cost estimate from CB and, if all goes as planned, the tables themselves by Spring/Summer 2010.
Federal Register Notice Regarding ACS 5-year data products
On March 6th, 2009 the Department of Commerce placed a Federal Register Notice and Request for Comment regarding the Census Bureau's Proposal for ACS 5-year Data Products. AASHTO intends to respond to the Notice by the April 20th deadline with a letter summarizing comments solicited from the Standing Committee on Planning. The draft response identifies the following limitations in the ACS. The proposed data products would:
Every state has participated in either the CTPP Consolidated Purchase, or the FHWA Pooled Fund for CTPP. The Consolidated Purchase constitutes approximately $3.9M and will be used to fund Census data products, training and capacity building, research and oversight.
CTPP Profiles using 2005-2007 ACS
These profiles include data from both the 2005-2007 American Community Survey (ACS) and the Census 2000. The profiles are designed to give transportation planners a handy way to examine trends by including two time points. The profiles are available only for those areas meeting a 20,000 population threshold. There are 5 profiles anticipated.
The profiles will be posted at the AASHTO page: http://ctpp.transportation.org/
AASHTO has two new staff of interest to the CTPP community. Penelope Weinberger is the CTPP Program Manager. Penelope spent the previous five years working for the FHWA Travel Model Improvement Program (TMIP) and is happy to bring her outreach skills and organizational ability to this side of the data equation. Before TMIP, Penelope was a consultant with Cambridge Systematics. Also new on board is Michelle Maggiore, P.E., AASHTO Program Director for Policy and Planning. Michelle brings more than 10 years of transportation planning and policy experience to her position and has a keen appreciation of the issues surrounding data for transportation planning. Penelope can be reached at email@example.com or 202-624-3556.
Elaine Murakami, FHWA Office of Planning Elaine.firstname.lastname@example.org
About one-quarter of the households in the United States can be characterized as a household of color. Hispanic (all races) households account for 11 percent of households, and African American households account for nearly 12 percent of all households. This article is limited to four categories of race and Hispanic origin:
Because some households do not fall into any of these four categories, the total includes households outside these groups. Because some African Americans are also Hispanic, these numbers reflect some double counting (U.S. Census Bureau).
|Households, 2007||% of total hhlds
|Hispanic (all races)||12,311,308||11.0|
Source: 2007 ACS, Tables B25003, and B25003A through B25003I
Race and Hispanic origin have been important variables distinguishing commute travel behavior, and this article examines what changes have occurred between 2000 and 2007 (Battelle, 2000 and Murakami, 2003). This is merely an initial glance. Further analyses should include additional variables such as gender, age, household income, household size, and neighborhood characteristics such as population density.
While gasoline prices went as high as an average of $3.15 per gallon in November 2007, the impact of highly variable gasoline prices and the economic downturn resulting in job losses in 2008 are not reflected in these results.
Nationwide, in 2007, about 9 percent of households did not have any vehicle. The proportion of households without any vehicle has continued to decline. Between 2000 and 2007, the proportion declined by another 0.5 to 1.0 percentage points, given the change in survey methods between the decennial census "long form" and the ACS (see Table 2). African American and Hispanic households are still more likely to be without a vehicle than White, non-Hispanic households, but the gap is closing.
As Figure 1 shows, for Hispanic households, the difference with White, non-Hispanic households is closing rapidly. In 1980, nearly 22 percent of Hispanic households had no vehicle, compared to 13 percent for the total population, a difference of 9 percentage points. By 2007, the difference was reduced to 4 percentage points (13 percent of Hispanic households and 9 percent for the total population).
The proportion of African American households without vehicles continues to be double that of White, non-Hispanic households. In 1980, over 32 percent of African American households had no vehicle, compared to 13 percent for the total population. In 2007, nearly 20 percent of African American households had no vehicle, compared to 9 percent for the total population.
|Average vehicles per household||1.69||1.73||1.77|
Note: 2000 ACS C2SS is the Census 2000 Supplementary Survey. Documentation can be found at http://factfinder.census.gov/servlet/MetadataBrowserServlet?type=surveyInstance&id=2000+Supplementary+Survey&survey=Decennial+Supplementary+Survey&_lang=en.
Mode to work
Driving alone remains, by far, the most popular mode to work for all groups. Over 75 percent of all workers said that they usually drove alone to work in 2007. Between 2000 and 2007, nationwide:
|Mode to Work|
|2000 ACS: C2SS||2005 ACS||2007 ACS|
|Total Workers (in millions)||115.1||128.3||127.7||133.1||139.3|
|Work at Home||3.0%||3.3%||3.2%||3.6%||4.1%|
Source: CTPP Status Report, December 2008, http://www.fhwa.dot.gov/planning/census_issues/ctpp/status_report/sr1208.cfm
For White, non-Hispanic workers, nearly 80 percent of workers usually drive alone. The proportion of workers driving alone is between 65 and 71 percent for the other groups.
In 2007, Hispanic workers were about twice as likely to carpool to work (over 17 percent) as White, non-Hispanic workers (9 percent). African American, Hispanic, and Asian workers were all much more likely to use transit to work than White, non-Hispanic workers. African American workers were four times more likely to use transit (12 percent) than White, non-Hispanic workers (3 percent).
Carpooling declines in all groups
Across all races, there was a decline in carpooling between 2000 and 2007. Hispanic workers are the most likely to carpool, but the 2007 results reflect a dramatic decline in carpool share, from 22.5 percent in 2000 to 17.5 percent in 2007. African American workers showed a similar decline in carpooling, from 16.0 percent in 2000 to 10.4 percent in 2007. As some of the difference may be due to changes in methodology between the ACS and the decennial census "long form," we can estimate that the decline in the carpool share is at least 4 percentage points for Hispanic workers, and at least 3 percentage points for African American workers.
Working at home increases in all groups
Across all races, working at home increased between 2000 and 2007. White, non-Hispanic and Asian workers had the larger increases in the share of workers who worked from home, about 1 percentage point. African American and Hispanic workers, on the other hand, had a smaller increase in working at home, which could be within the margin of error and could reflect differences due to survey methodology.
|2007 ACS||Total||White, non-Hispanic||Black||Asian||Hispanic(all races)|
|Work at Home||4.1||4.7||2.2||3.4||2.5|
|2000 Census||Total||White, non-Hispanic||Black||Asian||Hispanic(all races)|
|Work at Home||3.3||3.8||1.5||2.4||1.8|
Figure 2. Mode to work by Race and Hispanic Origin, 2007, Percent of Workers
Battelle (2000). Travel Patterns by People of Color. Federal Highway Administration. http://www.fhwa.dot.gov/ohim/trvpatns.pdf (accessed December 30, 2008)
Murakami, E. "Households without Vehicles, 2000" in CTPP Status Report, January 2003. http://www.fhwa.dot.gov/planning/census_issues/ctpp/status_report/sr0103.cfm (accessed January 6, 2008)
U.S. Census Bureau, Racial and Ethnic Classifications Used in Census 2000 and Beyond, http://www.census.gov/population/www/socdemo/race/racefactcb.html (accessed December 30, 2008)
Laura McWethy, Cambridge Systematics Inc., email@example.com
Synthetic populations are used in many fields for various purposes, and they are emerging as an important aspect of the travel demand modeling and forecasting process. As the research focus shifts from aggregate to microsimulation travel demand models, it is necessary to generate a synthetic population to be used as the model inputs. Most applications of synthetic populations in the transportation field use the same method of generation, specifically utilizing Census data at multiple geographic levels and using iterative proportional fitting (IPF) to reconcile the distributions of household attributes at different locations.
While the general population synthesis procedure is well established, along with the method of IPF, data such as the CTPP2000 allows for more accurate reproductions of households by utilizing the multivariate distributions at a smaller geographic level, perhaps making IPF unnecessary. Because CTPP2000 data is not widely used in travel demand model population synthesizers, it is important to know just how accurate the iterative proportional fitting procedure is in replicating the probabilities of multivariate distributions. Comparing the actual multivariate (MV) distributions from the CTPP2000 data with a synthesized population created using IPF gives an estimate of the fit and accuracy of the IPF procedure itself. This article compares the results of an IPF procedure utilizing the Census 2000 Public Use Microdata Sample (PUMS) data for the year 2000 with the multivariate distribution table obtained from the CTPP2000 data to determine the accuracy of the IPF procedure. Based on the case study presented here, the IPF procedure is not an accurate method of generating synthetic households, and can be improved upon greatly through the use of CTPP data at the TAZ level.
General Population Synthesis Method and Iterative Proportional Fitting
In very general terms, the population synthesis procedure involves computing a multivariate distribution of variables table for the desired geographic area and then drawing sample households from a data set containing detailed records to match these distributions. The proportions tables can be calculated in several ways, but the most commonly used method is IPF.
IPF involves two sets of data: larger scale data containing multivariate distributions of variables, and smaller scale data containing marginal control values for each variable. A multivariate table is typically generated from the detailed PUMS data at the Public Use Microdata Area (PUMA) level for the desired number of variables (typically two or three), but only for the PUMA. Analysts seek a detailed table for the smaller geographic areas, and so IPF is employed to accomplish this goal. IPF adjusts the multivariate proportions in the table so that they match the marginal proportions at the smaller geographic levels, which are generally obtained from the Census Summary Tape File 3 (STF3).
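The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration for a two-variable table, not the procedure used in the case study: the function name `ipf`, the helper `odds_ratio`, and the 2 x 2 seed and marginal values are all hypothetical, and the default stopping values simply mirror the 100-iteration and 0.001 criteria mentioned in this article.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=0.001):
    """Scale a 2-D seed table until its margins match the target marginals."""
    table = [list(row) for row in seed]
    iterations_used = 0
    for iterations_used in range(1, max_iter + 1):
        # Row step: scale each row so that it sums to its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [cell * target / row_sum for cell in table[i]]
        # Column step: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Columns now match (up to rounding); stop once rows are within tol.
        worst_gap = max(abs(sum(row) - target)
                        for row, target in zip(table, row_targets))
        if worst_gap < tol:
            break
    return table, iterations_used


def odds_ratio(table):
    """Cross-product ratio of a 2 x 2 table."""
    return (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])


# Hypothetical seed (e.g. a PUMA-level joint distribution) and hypothetical
# zone-level marginal proportions for two 2-category variables.
seed = [[0.20, 0.10],
        [0.30, 0.40]]
fitted, n_iter = ipf(seed, row_targets=[0.5, 0.5], col_targets=[0.6, 0.4])
```

Because IPF only rescales rows and columns, the fitted table keeps the cross-product ratio of the seed (here `odds_ratio(fitted)` equals `odds_ratio(seed)` up to rounding). In other words, the association between the variables is carried over unchanged from the larger geography to the smaller one.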
Synthetic Population Generation Application
To accomplish this analysis, a synthetic population was generated for Kent County, Michigan. This is an area comprised of four PUMAs and contains 565 Traffic Analysis Zones (TAZs). Kent County contains approximately 497,000 individuals, with an average household size of 2.4 persons. The two control variables looked at in this case study are household income in 4 categories (< $30k, $30k-$74,999, $75k-$149,999, ≥ $125k) and household size in 4 categories (1 person, 2 persons, 3 persons, and 4+ persons).
The IPF procedure was undertaken for all 565 TAZs using the entire county's MV distribution as the base seed and the marginal values for each TAZ calculated from the CTPP data. A maximum of 100 iterations and a closure criterion of 0.001 difference from the marginal distributions were used in the procedure, both stricter than the typically used values of 50 iterations and a 0.01 difference criterion. Tables 1 and 2 show the maximum difference between the IPF result and the actual MV distribution for each cell over all 565 TAZs, as well as the average difference between the IPF result and the actual MV distribution. Seven of the sixteen cell values have a maximum difference of over 90% probability. This initial look indicates that there are significant errors involved in the IPF procedure.
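As a rough sketch of how per-cell summaries like those in Tables 1 and 2 could be computed, the following Python function takes the fitted and actual tables for each zone and returns the maximum and the average absolute difference for every cell. It assumes the differences are absolute differences in cell proportions, which the article does not state explicitly; the function name and the two-zone input are hypothetical.

```python
def difference_summary(fitted_by_zone, actual_by_zone):
    """Per-cell maximum and mean absolute difference across all zones."""
    zones = list(actual_by_zone)
    n_rows = len(actual_by_zone[zones[0]])
    n_cols = len(actual_by_zone[zones[0]][0])
    max_diff = [[0.0] * n_cols for _ in range(n_rows)]
    sum_diff = [[0.0] * n_cols for _ in range(n_rows)]
    for zone in zones:
        for i in range(n_rows):
            for j in range(n_cols):
                gap = abs(fitted_by_zone[zone][i][j] - actual_by_zone[zone][i][j])
                max_diff[i][j] = max(max_diff[i][j], gap)
                sum_diff[i][j] += gap
    mean_diff = [[total / len(zones) for total in row] for row in sum_diff]
    return max_diff, mean_diff


# Hypothetical two-zone example with 2 x 2 tables of cell proportions.
fitted_tables = {
    "TAZ A": [[0.50, 0.00], [0.00, 0.50]],
    "TAZ B": [[0.25, 0.25], [0.25, 0.25]],
}
actual_tables = {
    "TAZ A": [[0.25, 0.25], [0.25, 0.25]],
    "TAZ B": [[0.25, 0.25], [0.25, 0.25]],
}
max_diff, mean_diff = difference_summary(fitted_tables, actual_tables)
```

A large maximum paired with a small average, as in the case study, suggests that the worst errors are concentrated in a few zones rather than spread evenly across all of them.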
A Chi-square test helps quantify the accuracy of the IPF procedure. At a 10% significance level, 151 of the 565 TAZs (26.7%) failed the Chi-square test. This indicates that important errors can emerge from the assumption of equal correlation between the geography levels. The error could be smaller in an actual population synthesis procedure using the Census data, as it would have included three seeds in this test population (one for each PUMA), instead of one seed for the whole county.
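The per-zone test can be illustrated with a short Python sketch. Several details are assumptions rather than facts from the case study: the 10% figure is treated as a significance level, the actual household counts are treated as the observed values and the IPF proportions as the expected distribution, and the degrees of freedom are taken as (rows - 1) x (columns - 1), which gives 9 for a 4 x 4 table. The function names and the example counts are hypothetical; the critical values are standard chi-square table entries.

```python
# Upper-tail chi-square critical values at the 10% level, by degrees of freedom.
CRITICAL_10PCT = {1: 2.706, 9: 14.684}


def chi_square_statistic(observed_counts, expected_props):
    """Pearson chi-square statistic for observed counts vs. expected proportions."""
    total = sum(sum(row) for row in observed_counts)
    stat = 0.0
    for obs_row, prop_row in zip(observed_counts, expected_props):
        for observed, prop in zip(obs_row, prop_row):
            expected = total * prop
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
            elif observed > 0:
                # Households observed in a cell the fitted table says is empty.
                return float("inf")
    return stat


def fails_test(observed_counts, expected_props):
    """True if the zone's statistic exceeds the 10% critical value."""
    dof = (len(observed_counts) - 1) * (len(observed_counts[0]) - 1)
    return chi_square_statistic(observed_counts, expected_props) > CRITICAL_10PCT[dof]


# Hypothetical 2 x 2 zone: IPF predicts equal shares, actual counts do not agree.
ipf_props = [[0.25, 0.25], [0.25, 0.25]]
poor_fit = [[20, 0], [0, 20]]
good_fit = [[10, 10], [10, 10]]
poor_stat = chi_square_statistic(poor_fit, ipf_props)
```

Counting the zones for which `fails_test` returns True, and dividing by the number of zones, would give a failure share comparable in form to the 26.7% reported above.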
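Table 1. Maximum difference between the IPF result and the actual MV distribution, by cell, over all 565 TAZs
|Income 1||Income 2||Income 3||Income 4|
|HH Size 1||0.999||0.863||0.250||0.137|
|HH Size 2||0.550||0.640||0.997||0.951|
|HH Size 3||0.548||0.890||0.997||0.499|
|HH Size 4||0.999||1.000||0.901||0.842|
Table 2. Average difference between the IPF result and the actual MV distribution, by cell
|Income 1||Income 2||Income 3||Income 4|
|HH Size 1||0.108||0.056||0.010||0.004|
|HH Size 2||0.055||0.086||0.060||0.032|
|HH Size 3||0.022||0.075||0.053||0.021|
|HH Size 4||0.030||0.112||0.088||0.047|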
Conclusion
As shown by the case study, IPF is not necessarily the accurate procedure it is typically accepted to be. Procedures for generating synthetic populations should acknowledge the error introduced by the equal correlation assumption over the two geographic levels. Most current population generators do not include validation statistics at any stage of the process, and it is important to understand the variance inherent in the population, both in the controlled and uncontrolled variables. Utilizing the CTPP2000 data is an alternate method that ensures accuracy in the controlled variables, and eliminates error due to the IPF procedure. While it is not possible to reduce the error of the uncontrolled variables, it is possible to quantify the variance by multiple simulations, which is a validation step not undertaken enough in current procedures. While generating synthetic populations is a well-established process, there are many areas for improvement in the current practice, especially through the explicit validation procedures discussed here.
This research was done in conjunction with a Master's degree program in Transportation Engineering at the University of Texas at Austin, completed in December 2006.
Elaine Murakami, FHWA Office of Planning Elaine.firstname.lastname@example.org
Given the restrictions of the Census Bureau's Disclosure Review Board on the CTPP production, it behooves us to learn to use other Census resources. This article is a brief introduction to using Public Use Microdata Samples (PUMS) from the decennial Censuses and the American Community Survey (ACS).
The Census Bureau has prepared an introductory PowerPoint presentation on PUMS: http://www.census.gov/programs-surveys/acs/guidance/training-presentations.html
In addition, a recording of the web-training session conducted by Katie Genadek of the University of Minnesota held on April 16 is available at http://fhwa.na3.acrobat.com/p25106595/
What is PUMS?
The Public Use Microdata Sample, or PUMS, is a sample of population and housing unit records. The Census Bureau has released PUMS from decennial censuses, and now has created similar PUMS files for the ACS. The PUMS files include the actual responses from the ACS questionnaire for each person and household. Some responses have been edited to protect the confidentiality of the respondents. As an added protection, the geographic detail of residential location is limited to a Public Use Microdata Area (PUMA). Another protection is that only some of the ACS responses are included in the PUMS.
Where are PUMS Data?
PUMS can be accessed using IPUMS from the University of Minnesota. IPUMS-USA (http://usa.ipums.org/usa/) is a web-based project dedicated to collecting and distributing United States census data. It currently includes samples from 15 federal censuses and from the American Community Surveys of 2000-2007. These samples draw on every surviving census from 1850-2000, and the 2000-2007 ACS samples.
IPUMS is not pre-tabulated data like the decennial Census Summary Tape Files or Summary Files. Instead, it is a sample of the microdata records. Each record is a person, with all characteristics numerically coded. This means that data users can select from all the available variables for their own analysis, rather than being restricted to pre-established 2- or 3-way cross-tabulations.
In most samples persons are organized into households, making it possible to study the characteristics of people in the context of their families or other co-residents. IPUMS-USA is currently funded through 2012 by several grants from the National Institute of Child Health and Human Development (Ruggles et al, 2008).
The IPUMS site uses SDA to allow users to process the data directly. That is, data analysts can run the microdata without having stand-alone statistical software on their own PC. SDA is a set of programs for the documentation and Web-based analysis of survey data. There are also procedures for creating customized subsets of datasets. This set of programs is developed and maintained by the Computer-assisted Survey Methods Program (CSM) at the University of California, Berkeley.
With SDA, you can recode variables into groups (for example: ages, household income, travel time to work), and can prepare simple cross-tabulations, or run regressions or logit models.
Here is a simple example of Means of Transportation to Work cross-tabulated by Type of Housing (that is, whether or not it was a group quarters sample). Nathan Erlbaum from New York State DOT was interested in the impact of group quarters population on bicycling to work numbers, which led to this cross tabulation.
|Row||tranwork||Means of transportation to work||0-70|
|Column||gq||Group quarters status||1-5|
|Filter||tranwork(10-70)||Means of transportation to work||0-70|
Households under 1970 definition
Additional households under 1990 definition
Other group quarters
Additional households under 2000 definition
|tranwork||10: Auto, truck, or van||88.4|
|31: Bus or trolley bus||2.5|
|32: Streetcar or trolley car||.1|
|33: Subway or elevated||1.5|
|50: Walked only||2.5|
|70: Worked at home||3.2|
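For readers who download the microdata instead of using SDA, the same kind of table can be approximated with a short script. The sketch below is illustrative only: the records are invented, the weight field `perwt` is assumed to be the person weight supplied with the extract, and the function names are hypothetical. It applies the same filter as above (`tranwork` from 10 to 70) and reports weighted column percentages, that is, the mode shares within each group quarters category.

```python
from collections import defaultdict


def filter_workers(records, low=10, high=70):
    """Keep records whose tranwork code falls in the filter range."""
    return [rec for rec in records if low <= rec["tranwork"] <= high]


def weighted_column_percent(records, row_var, col_var, weight_var):
    """Weighted cross-tabulation, expressed as percentages of each column total."""
    cell_totals = defaultdict(float)
    col_totals = defaultdict(float)
    for rec in records:
        weight = rec[weight_var]
        cell_totals[(rec[row_var], rec[col_var])] += weight
        col_totals[rec[col_var]] += weight
    return {
        (row, col): 100.0 * total / col_totals[col]
        for (row, col), total in cell_totals.items()
    }


# Invented person records; the gq values are illustrative codes in the 1-5 range.
sample = [
    {"tranwork": 10, "gq": 1, "perwt": 80.0},
    {"tranwork": 50, "gq": 1, "perwt": 20.0},
    {"tranwork": 10, "gq": 4, "perwt": 30.0},
    {"tranwork": 50, "gq": 4, "perwt": 70.0},
    {"tranwork": 0, "gq": 1, "perwt": 50.0},  # outside the filter, dropped
]
shares = weighted_column_percent(filter_workers(sample), "tranwork", "gq", "perwt")
```

In this invented sample, walking (code 50) accounts for 20 percent of workers in the first category and 70 percent in the second, which is the kind of contrast the group quarters comparison is meant to reveal.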
An alternative to using IPUMS is to use the Census Bureau's DataFerrett, as documented in this handbook: "What Public Use Microdata Sample (PUMS) Data Users Need to Know" http://www.census.gov/programs-surveys/acs/guidance/handbooks.html/#pums
http://www.census.gov/programs-surveys/acs/Downloads/ACS/Accuracy00_C2SS.pdf
One can also download files from the Census Bureau PUMS site: http://www.census.gov/main/www/pums.html
What geographic areas are available for ACS PUMS?
The ACS PUMS files use a number of geographies created specifically for them. These include PUMAs, SuperPUMAs, POW-PUMAs, and MIG-PUMAs. PUMAs were last defined for the 2000 Decennial Census and have a minimum population of 100,000 residents. PUMAs are built from counties; in densely populated areas and counties, they are built from incorporated places and census tracts and, in the six New England states, from MCDs (towns and cities). In more rural areas, it is likely that several counties have been grouped to make up one PUMA. The 2000 PUMAs were defined to have a relationship with the then-current (1999) metropolitan areas.
With the ACS, the PUMAs are now being used as a geographic tabulation unit for annual data and for 3-year ACS estimates. Since PUMAs have a population threshold of 100,000, they meet the 65,000 population threshold used in publication of annual ACS estimates. This means that one can use American FactFinder to request data for PUMAs.
For Census 2000, two levels of PUMA geography were defined: one for a 5% sample of all the survey records and the other for a 1% sample. Since a PUMA has to contain at least 100,000 residents to protect the residents' confidentiality, the 5% PUMAs are smaller in area. For ACS, the PUMAs available are the same as those used for the 5-Percent sample. The only exception is in Louisiana, where 3 PUMAs were combined to meet confidentiality requirements because of population displacement from Hurricane Katrina.
SuperPUMAs are aggregates of PUMAs with a minimum population threshold of 400,000. SuperPUMAs cover the whole country and nest within states. They were designed to accommodate the 1% sample.
Place-of-Work PUMAs (POW-PUMAs) are modified PUMAs and SuperPUMAs that contain information on place of work. They are most often county based, but can also be defined to the place level, and, in the six New England states, can be MCD-based. They are not a strict geocoding of workplace to PUMA. An equivalency of PUMA to POW-PUMA can be found in Appendix N of the Census 2000 technical documentation: http://www.census.gov/prod/cen2000/doc/pums.pdf
More documentation on POW-PUMA codes is available at http://factfinder.census.gov/home/en/acs_pums_2007_3yr.html; look under "Documentation" for Place of Work PUMA.
Migration PUMAs (MIG-PUMAs) are similar to POW-PUMAs but they relate to place of residence information. MIG-PUMAs are based on counties and, in the six New England states, MCDs, but are not place-based.
Where are maps of PUMAs?
Figure 1 is an example of the maps showing PUMAs in Eastern Washington.
Where to get PUMA shapefiles?
The PUMA shapefiles are available from the Census Bureau TIGER/Line (T/L) products at: http://www.census.gov/geo/www/tiger/index.html. Since PUMAs nest within a state, to download the 2000 PUMA TIGER/Line shapefiles one must first select a state from which to download the PUMA file.
Figure 1 Example of PUMA Boundary
Steven Ruggles, Matthew Sobek, Trent Alexander, Catherine A. Fitch, Ronald Goeken, Patricia Kelly Hall, Miriam King, and Chad Ronnander. Integrated Public Use Microdata Series: Version 4.0 [Machine-readable database]. Minneapolis, MN: Minnesota Population Center [producer and distributor], 2008.
CTPP Hotline: 202/366-5000
CTPP Listserv: http://www.chrispy.net/mailman/listinfo/ctpp-news
CTPP Website: www.fhwa.dot.gov/planning/census_issues/ctpp/
FHWA Website for Census issues: http://www.fhwa.dot.gov/planning/census_issues/
CTPP 2000 Profiles: http://ctpp.transportation.org
1990 and 2000 CTPP downloadable via Transtats: http://transtats.bts.gov/
TRB Subcommittee on census data: http://www.trbcensus.com
Mary Lynn Tischer, VA DOT
Jonette Kreideweis, MN DOT
Census Bureau: Housing and Household Economic Statistics Division
The CTPP Listserv serves as a web-forum for posting questions, and sharing information on Census and ACS. Currently, over 700 users are subscribed to the listserv. To subscribe, please register by completing a form posted at: http://www.chrispy.net/mailman/listinfo/ctpp-news
On the form, you can indicate if you want e-mails to be batched in a daily digest. The website also includes an archive of past e-mails posted to the listserv.