REPORT

This report is an archived publication and may contain dated technical, contact, and link information

Publication Number: FHWA-HRT-16-055 Date: January 2016

User-Friendly Traffic Incident Management (TIM) Program Benefit-Cost Estimation Tool

APPENDIX A: AN EXAMPLE OF REGRESSION DEVELOPMENT PROCESS

In this section, the regression model development process is illustrated by considering the travel delay of cars. Results of statistical analysis for the model are shown in table 13. Fit diagnostics for developed models, including residual graphs for each explanatory variable, were computed and analyzed. Additional steps to fit a nonlinear regression model to the data are presented. Similar steps were taken for development of the travel delay of trucks and fuel consumption models for cars, details of which are omitted for brevity.

Table 13. Linear regression model for travel delay of light-duty vehicles (cars).

Results
Model: Linear_Regression_Model
Dependent Variable: TotalDelayOfCar(hours)
Number of Observations Read	1320
Number of Observations Used	1320

Analysis of Variance
Source	DF	Sum of Squares	Mean Square	F Value	Pr > F
Model	6	16431362977	2738560496	348.29	0.0000
Error	1313	10324098945	7862984.726
Corrected Total	1319	26755461923

	Root MSE	2804.10141	R-Square	0.6141
	Dependent Mean	3385.04141	Adj R-Sq	0.6124
	Coeff Var	82.83802

Parameter Estimates
Variable	DF	Parameter Estimate	Standard Error	t Value	Pr > |t|
Intercept	1	-4397.909706	520.739103	-8.45	0.0000
NofLaneIndex1	1	-36.5690578	3.88923507	-9.40	0.0000
Duration(hours)	1	1960.455234	84.50397765	23.20	0.0000
FFS(km/h)	1	19.40740278	3.930616261	4.94	0.0000
COMPTP(*10k)	1	13.49217169	14.82628496	0.91	0.3630
Volume(k)	1	4636.452108	123.6968344	37.48	0.0000
Gradient(*10k)	1	182.4028943	27.00029045	6.76	0.0000

Covariance of Estimates
Variable	Intercept	NofLaneIndex1	Duration(hours)	FFS(km/h)	COMPTP(*10k)	Volume(k)	Gradient(*10k)
Intercept	271169.2134	-1146.180224	-9689.455221	-1341.431227	-1923.665266	-18407.66357	-3972.210894
NofLaneIndex1	-1146.180224	15.12614943	15.51532574	0.730591989	-0.624425891	20.24586907	0.983763344
Duration(hours)	-9689.455221	15.51532574	7140.922238	-10.98496648	-24.13639598	578.5220012	19.08887543
FFS(km/h)	-1341.431227	0.730591989	-10.98496648	15.4497442	-1.276124235	-19.22720807	-0.119290199
COMPTP(*10k)	-1923.665266	-0.624425891	-24.13639598	-1.276124235	219.8187258	91.17392218	7.180905576
Volume(k)	-18407.66357	20.24586907	578.5220012	-19.22720807	91.17392218	15300.90684	73.58130849
Gradient(*10k)	-3972.210894	0.983763344	19.08887543	-0.119290199	7.180905576	73.58130849	729.0156843

Correlation of Estimates
Variable	Intercept	NofLaneIndex1	Duration(hours)	FFS(km/h)	COMPTP(*10k)	Volume(k)	Gradient(*10k)
Intercept	1.0000	-0.5659	-0.2202	-0.6554	-0.2492	-0.2858	-0.2825
NofLaneIndex1	-0.5659	1.0000	0.0472	0.0478	-0.0108	0.0421	0.0094
Duration(hours)	-0.2202	0.0472	1.0000	-0.0331	-0.0193	0.0553	0.0084
FFS(km/h)	-0.6554	0.0478	-0.0331	1.0000	-0.0219	-0.0395	-0.0011
COMPTP(*10k)	-0.2492	-0.0108	-0.0193	-0.0219	1.0000	0.0497	0.0179
Volume(k)	-0.2858	0.0421	0.0553	-0.0395	0.0497	1.0000	0.0220
Gradient(*10k)	-0.2825	0.0094	0.0084	-0.0011	0.0179	0.0220	1.0000
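
The summary statistics in table 13 are tied together by standard least-squares identities, so a few of them can be recomputed as a check on the transcribed table. The sketch below uses only values copied from table 13; the variable names are illustrative and are not part of the SAS output.

```python
import math

# Values copied from the "Analysis of Variance" block of table 13.
n_obs = 1320
df_model, df_error = 6, 1313
ss_model, ss_error = 16431362977.0, 10324098945.0
dependent_mean = 3385.04141

# Mean squares and the overall F statistic.
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error

# Goodness-of-fit summaries.
ss_total = ss_model + ss_error
r_square = ss_model / ss_total
adj_r_square = 1.0 - (1.0 - r_square) * (n_obs - 1) / df_error
root_mse = math.sqrt(ms_error)
coeff_var = 100.0 * root_mse / dependent_mean

# Standard error and t value of Duration(hours): the standard error is the
# square root of the matching diagonal entry of "Covariance of Estimates".
estimate_duration = 1960.455234
var_duration = 7140.922238
se_duration = math.sqrt(var_duration)
t_duration = estimate_duration / se_duration

# Correlation of the Intercept and FFS(km/h) estimates, from their covariance.
cov_int_ffs = -1341.431227
var_int, var_ffs = 271169.2134, 15.4497442
corr_int_ffs = cov_int_ffs / math.sqrt(var_int * var_ffs)

print(f"F = {f_value:.2f}, R-Square = {r_square:.4f}, Adj R-Sq = {adj_r_square:.4f}")
print(f"Root MSE = {root_mse:.2f}, Coeff Var = {coeff_var:.2f}")
print(f"t(Duration) = {t_duration:.2f}, corr(Intercept, FFS) = {corr_int_ffs:.4f}")
```

The recomputed values agree with the table to the printed precision, which suggests the three blocks of table 13 are internally consistent.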

The developed regression models rest on four assumptions about the dependent variable: independence, normality, homoscedasticity (constant variance of the response variable), and linearity. These assumptions can be reexpressed in terms of the modeling errors, which allows the assumptions on which the model is built to be validated. If the random errors are independent and normally distributed with zero mean and constant variance σ², they can be considered a random sample from N(0, σ²). The errors are best represented by standardized residuals, which SAS calculates so that they have a variance of 1. A summary of goodness-of-fit test results for travel delay of light-duty vehicles is presented in figure 19. Each diagnostic is discussed separately. The behavior and analysis of the other regression models were very similar to this case.
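
As an illustration of how residuals are scaled to unit variance, the following sketch fits a one-variable least-squares line and divides each residual by its estimated standard deviation. It is a minimal example on synthetic data, not the report's simulation output or its SAS procedure; the variable names and the numbers used to generate the data are assumptions made for the example.

```python
import math
import random

random.seed(0)  # synthetic data for illustration only

# Hypothetical sample: delay grows linearly with incident duration plus noise.
n = 200
duration = [random.uniform(0.25, 3.0) for _ in range(n)]
delay = [500.0 + 2000.0 * d + random.gauss(0.0, 300.0) for d in duration]

# Ordinary least squares with a single explanatory variable.
x_bar = sum(duration) / n
y_bar = sum(delay) / n
sxx = sum((x - x_bar) ** 2 for x in duration)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(duration, delay))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Raw residuals and root mean square error (two fitted parameters).
residuals = [y - (intercept + slope * x) for x, y in zip(duration, delay)]
root_mse = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage of each observation, then residuals scaled by their
# estimated standard deviation root_mse * sqrt(1 - leverage).
leverage = [1.0 / n + (x - x_bar) ** 2 / sxx for x in duration]
standardized = [
    e / (root_mse * math.sqrt(1.0 - h)) for e, h in zip(residuals, leverage)
]

mean_std = sum(standardized) / n
var_std = sum((r - mean_std) ** 2 for r in standardized) / (n - 1)
print(f"standardized residuals: mean = {mean_std:.3f}, variance = {var_std:.3f}")
```

If the assumptions hold, the scaled residuals should look like a sample from N(0, 1), so values far outside ±2 are worth inspecting.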

Figure 19. Chart. Fit diagnostics for total travel delay of light-duty vehicles.

In general, any systematic pattern in the residuals indicates a violated assumption and systematic error (figure 19). In this model, the linearity assumption appears to be violated because the residuals are not scattered randomly around zero and instead form a clear pattern. In addition, the variance of the residuals is not constant; it appears to take two distinct values.

The non-constant variance shows that the accuracy of the model decreases as the travel delay of cars (TDc) increases. This problem is known as heteroscedasticity.
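
One informal way to see heteroscedasticity numerically is to order the residuals by fitted value and compare their spread in the lower and upper halves. The sketch below does this on synthetic data whose noise grows with the explanatory variable. It is a simple split-sample comparison rather than a formal test, and the data and names are assumptions made for the example.

```python
import random

random.seed(1)  # synthetic data for illustration only

# Hypothetical sample in which the noise standard deviation grows with volume.
n = 400
volume = [random.uniform(1.0, 10.0) for _ in range(n)]
delay = [100.0 + 50.0 * v + random.gauss(0.0, 10.0 * v) for v in volume]

# Ordinary least squares with a single explanatory variable.
x_bar = sum(volume) / n
y_bar = sum(delay) / n
sxx = sum((x - x_bar) ** 2 for x in volume)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(volume, delay))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * x for x in volume]
residuals = [y - f for y, f in zip(delay, fitted)]

# Order residuals by fitted value and split them into two halves.
ordered = [e for _, e in sorted(zip(fitted, residuals))]
half = n // 2
low, high = ordered[:half], ordered[half:]


def mean_square(values):
    """Average squared residual, a rough estimate of the error variance."""
    return sum(v ** 2 for v in values) / len(values)


variance_ratio = mean_square(high) / mean_square(low)
print(f"residual variance, upper half / lower half = {variance_ratio:.2f}")
```

A ratio well above 1 means the residuals fan out as the predicted value grows, which is the pattern described above.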

Figure 20. Chart. Plot of residuals for total travel delay of light-duty vehicles. Figure shows the distribution of residual delays for total hours of car delays for light-duty vehicles. The distribution begins at -9000 hours, peaks at -1000 hours then decreases to zero at roughly 9000 hours.

Figure 21. Chart. Plot of R-student residuals for total travel delay of light-duty vehicles. Figure shows the Rstudent residuals on the vertical axis over predicted value on the horizontal axis for total travel delay of light duty vehicles. The results are clustered between +2 and -2 on the vertical axis and 2000 and 8000 on the horizontal axis.

In the Quantile-Quantile plot (figure 22), the slope of the curve through the plotted points increases from left to right, which indicates that a theoretical distribution skewed to the right, such as a log-normal distribution, might fit the data better. In addition, the mild curvature indicates a small shape parameter for the chosen distribution (i.e., σ for log-normal). Cook’s Distance (figure 23) shows outlier points, because not all data points lie within two residual units of the zero line. However, since the data result from designed experiments, the outliers cannot be eliminated with this method.
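
The reading of the Quantile-Quantile plot can be illustrated with a small sketch that builds the plotted points by hand: sorted sample values are paired with standard normal quantiles, and the slope of the lower part of the curve is compared with the slope of the upper part. The sample here is synthetic log-normal data, and the plotting positions (i + 0.5)/n are one common convention assumed for the example.

```python
import random
from statistics import NormalDist

random.seed(2)  # synthetic data for illustration only

# Hypothetical right-skewed sample (log-normal with shape parameter 0.5).
n = 500
sample = sorted(random.lognormvariate(0.0, 0.5) for _ in range(n))

# Theoretical standard normal quantiles at plotting positions (i + 0.5) / n.
standard_normal = NormalDist()
theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]


def segment_slope(lo, hi):
    """Slope of the Q-Q curve between two order statistics."""
    return (sample[hi] - sample[lo]) / (theoretical[hi] - theoretical[lo])


# Compare the curve between the 10th and 50th percentiles
# with the curve between the 50th and 90th percentiles.
i10, i50, i90 = n // 10, n // 2, (9 * n) // 10
lower_slope = segment_slope(i10, i50)
upper_slope = segment_slope(i50, i90)
print(f"lower slope = {lower_slope:.3f}, upper slope = {upper_slope:.3f}")
```

For a right-skewed sample the upper slope exceeds the lower slope, so the curve bends upward from left to right. A normal sample would give roughly equal slopes.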

Figure 22. Chart. Quantile-Quantile plot for total travel delay of light-duty vehicles. Figure 22 shows the Q-Q plot of residuals for total car delays in hours on the vertical axis over quantiles on the horizontal axis. The plot picks up at roughly -7000 at the -2 quantile and tracks evenly to roughly 3000 and the +2 quantile.

Figure 23. Chart. Outlier and leverage diagnostics for total travel delay of light-duty vehicles.
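
To make the outlier and leverage diagnostics concrete, the sketch below computes leverage and Cook's distance for a one-variable least-squares fit, using the standard formula D = e² / (p s²) × h / (1 − h)². The data are synthetic, with one deliberately unusual observation added. None of the numbers come from the report.

```python
import random

random.seed(3)  # synthetic data for illustration only

# Hypothetical sample following a straight line with small noise.
x = [float(i) for i in range(30)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# One added observation that is far from the other x values and off the trend.
x.append(60.0)
y.append(2.0 + 3.0 * 60.0 - 80.0)

n, p = len(x), 2  # p counts the fitted parameters (intercept and slope)

# Ordinary least squares with a single explanatory variable.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s2 = sum(e ** 2 for e in residuals) / (n - p)  # mean square error

# Leverage and Cook's distance for each observation.
leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
cooks = [
    (e ** 2 / (p * s2)) * h / (1.0 - h) ** 2 for e, h in zip(residuals, leverage)
]

most_influential = max(range(n), key=lambda i: cooks[i])
print(f"most influential index = {most_influential}, "
      f"leverage = {leverage[most_influential]:.3f}, "
      f"Cook's distance = {cooks[most_influential]:.2f}")
```

The added observation has both high leverage and a large Cook's distance. In a designed experiment such a point is a legitimate run rather than a recording error, which is why flagged points are not simply dropped.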

As part of additional analysis, the residuals are plotted separately against each explanatory variable (figure 24). Since the variables are uncorrelated by design, each graph shows the direct relationship of the dependent variable and the explanatory variable. Travel delays of light-duty vehicles seem to have a nonlinear relationship with the number of available lanes. The residuals suggest data-fitting functions, such as log-normal distributions. Incident duration has a random scatter plot suggesting a quadratic relationship between incident duration and travel delay of cars. Also, the variance is not constant and the residuals fan out.

Residuals for volume show a cosine-like or bimodal pattern. Residuals associated with the FFS, truck composition, and gradient are also randomly scattered around zero; therefore, the linearity assumption seems reasonable for these variables.
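
A residual-versus-variable plot can be summarized numerically by checking whether the residuals of a straight-line fit still correlate with a squared term in the same variable. The sketch below does this on synthetic data generated from a quadratic relationship. The variable `x` stands for any single explanatory variable and is hypothetical, as are all the numbers.

```python
import math
import random

random.seed(4)  # synthetic data for illustration only

# Hypothetical sample in which the true relationship is quadratic.
n = 300
x = [random.uniform(0.0, 10.0) for _ in range(n)]
y = [10.0 + 2.0 * xi ** 2 + random.gauss(0.0, 5.0) for xi in x]

# Straight-line least-squares fit.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Residuals are uncorrelated with x by construction of the fit,
# so any remaining structure shows up against the squared term.
squared_term = [(xi - x_bar) ** 2 for xi in x]
linear_corr = pearson(residuals, x)
curvature_corr = pearson(residuals, squared_term)
print(f"corr with x = {linear_corr:.3f}, corr with squared term = {curvature_corr:.3f}")
```

A strong correlation with the squared term is the numerical counterpart of a curved band in the residual plot and suggests adding a transformed version of that variable to the model.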

Figure 24. Chart. Scatterplots of residuals against explanatory variables.

Given these observations, to improve the model, new variables based on the above analysis were introduced to the model and the process was continued. These variables were developed from a variety of transformations involving the explanatory variables.

For travel delay of light-duty vehicles, residual graphs for the final fitted model are shown in figure 25 and figure 26. The residuals are normally distributed around zero (figure 25), and the systematic patterns are eliminated (figure 26).
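
The effect of transforming the response can be sketched by fitting the same straight line to a right-skewed response and to its logarithm, then comparing the skewness of the two sets of residuals. The example uses a log transformation because figure 25 refers to the log of total travel delay. The exact set of transformations used in the final model is not reproduced here, and the data are synthetic.

```python
import math
import random

random.seed(5)  # synthetic data for illustration only

# Hypothetical sample with multiplicative (log-normal) errors.
n = 500
duration = [random.uniform(0.25, 3.0) for _ in range(n)]
delay = [math.exp(5.0 + 0.8 * d + random.gauss(0.0, 0.5)) for d in duration]


def fit_residuals(xs, ys):
    """Residuals of a one-variable ordinary least-squares fit."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def skewness(values):
    """Moment-based sample skewness (zero for a symmetric distribution)."""
    m = len(values)
    mean = sum(values) / m
    m2 = sum((v - mean) ** 2 for v in values) / m
    m3 = sum((v - mean) ** 3 for v in values) / m
    return m3 / m2 ** 1.5


raw_skew = skewness(fit_residuals(duration, delay))
log_skew = skewness(fit_residuals(duration, [math.log(v) for v in delay]))
print(f"residual skewness: raw response = {raw_skew:.2f}, log response = {log_skew:.2f}")
```

Residuals from the log-scale fit are close to symmetric, while those from the raw-scale fit are skewed to the right. This mirrors the change from the skewed residuals of the linear model to the roughly normal residuals in figure 25.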

Figure 25. Chart. Normality of residuals for total delay of cars. Figure shows the distribution of residuals for log of total travel delays by car. The distribution is normal, beginning at approximately -2.5, peaking at 30% at 0 residuals, then decreasing to 0 percent again at approximately 2.5 residuals.

Figure 26. Chart. Standard residuals for total delay of cars. Figure shows the distribution of standard residuals for the total delay of cars. The chart plots Rstudent on the vertical axis over predicted value. The distribution is concentrated between RStudent of -2.1 and +2.1 and predicted value of 0 and 10.0.
