U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
202-366-4000


Skip to content
Facebook iconYouTube iconTwitter iconFlickr iconLinkedInInstagram

Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations

Report
This report is an archived publication and may contain dated technical, contact, and link information
Publication Number: FHWA-RD-03-060

Concrete Mixture Optimization Using Statistical Methods

Previous | Table of Contents | Next

CHAPTER 2. Background on Statistical Methods

2.1 Response Surface Methodology

Response surface methodology (RSM) consists of a set of statistical methods that can be used to develop improve, or optimize products [3]. RSM typically is used in situations where several factors (in the case of concrete, the proportions of individual component materials) influence one or more performance characteristics, or responses (the fresh and hardened properties of the concrete). RSM may be used to optimize one or more responses (e.g., maximize strength, minimize chloride penetration), or to meet a given set of specifications (e.g., a minimum strength specification or an allowable range of slump values). There are three general steps that comprise RSM: experiment design, modeling, and optimization. Each of these is described below.

Concrete is a mixture of several components. Water, portland cement, and fine and coarse aggregates form a basic concrete mixture. Various chemical and mineral admixtures, as well as other materials such as fibers, also may be added. For a given set of materials, the proportions of the components directly influence the properties of the concrete mixture, both fresh and hardened.

2.2 Experiment Design

Consider a concrete mixture consisting of q component materials (where q is the number of component materials). Two experiment design approaches can be applied to concrete mixture optimization: the classic mixture approach, in which the q mixture components are the variables, [7] and the mathematically independent variable (MIV) approach, in which q mixture components are transformed into q-1 independent mixture-related variables [8]. Each technique has advantages and disadvantages. In the classic mixture approach, the sum of the proportions must be 1; therefore the variables are not all independent. This allows the experimental region of interest to be defined more naturally, but the analysis of such experiments is more complicated. The MIV approach, with the variables independent, permits the use of classical factorial and response surface designs [9], but has the undesirable feature that the experimental region changes depending on how the q mixture components are reduced to q-1 independent factors.

In this report, the MIV approach is referred to as the factorial approach because factorial experiment designs form the basis of the approach. The following sections present general (nonrigorous) descriptions of each method (for a detailed discussion of these methods, see reference 3).

2.2.1 Mixture Approach

In the mixture approach, the total amount (mass or volume) of the product is fixed, and the settings of each of the q components are proportions. Because the total amount is constrained to sum to one, only q-1 of the factors (component variables) can be chosen independently.

Figure 1. Example of triangular simplex region from three-component mixture experiment. Diagram. The simplex is an equilateral triangle with each vertex representing a pure component (X subscript any given integer equals 1). The axis for each component extends from the vertex (X subscript any given integer equals 1) to the midpoint of the opposite side of the triangle (X subscript any given integer equals 0).

Figure 1. Example of triangular simplex region from three-component mixture experiment.

As a simple (hypothetical) example of a mixture experiment, consider concrete as a mixture of three components: water (x1), cement (x2), and aggregate (x3), where each xi represents the volume fraction of a component. Assume the coarse-to-fine aggregate ratio is held constant. The volume fractions of the components sum to one, and the region defined by this constraint is the regular triangle (or simplex) shown in figure 1. The axis for each component xi extends from the vertex it labels (xi = 1) to the midpoint of the opposite side of the triangle (xi = 0). The vertex represents the pure component. For example, the vertex labeled x1 is the pure water mixture with x1 = 1, x2 = 0, and x3 = 0, or (1,0,0). The point where the three axes intersect, with coordinates (1/3,1/3,1/3), is called the centroid.

Figure 2. Simplex-centroid design for three variables. Diagram. This figure is a triangular simplex with seven design points shown. The mixtures comprising this design include the three vertices, the three midpoints of edges, and the centroid (intersection of the 3 component axes).

Figure 2. Simplex-centroid design for three variables

A good experiment design for studying properties over the entire region of a three-component mixture would be the simplex-centroid design shown in figure 2 (this example is included as an illustration only, since much of this region would not represent either feasible or workable concrete mixtures). The points shown in figure 2 represent mixtures included in the experiment and include all vertices, midpoints of edges, and the overall centroid.

All responses (properties) of interest would be measured for each mixture in the design and modeled as a function of the components. Typically, polynomial functions are used for modeling, but other functional forms can also be used. For three components, the linear polynomial model for a response y is:

Equation 1:
Equation 1. Lowercase Y, which is the response of interest, equals lowercase B superscript asterisk subscript 0, plus lowercase B superscript asterisk subscript 1 times lowercase X subscript 1, plus lowercase B superscript asterisk subscript 2 times lowercase X subscript 2, plus lowercase B superscript asterisk subscript 3 times lowercase X subscript 3, plus lowercase E.

where the bi * are constants and e, the random error term, represents the combined effects of all variables not included in the model. For a mixture experiment, lowercase B superscript asterisk subscript 1; therefore, the model can be reparameterized in the form:

Equation 2:
Equation 2. Lowercase Y equals lowercase B subscript 1 times lowercase X subscript 1, plus lowercase B subscript 2 times lowercase X subscript 2, plus lowercase B subscript 3 times lowercase X subscript 3, plus lowercase E.

using lowercase B superscript asterisk subscript 0 equals lowercase B superscript asterisk subscript 0 times lowercase X superscript 1 plus lowercase X superscript 2 plus lowercase X superscript 3. This form is called the Scheffé linear mixture polynomial [7].

Similarly, the quadratic polynomial:

Equation 3:
Equation 3. Lowercase Y equals lowercase B subscript 0, plus lowercase B superscript asterisk subscript 1 times lowercase X subscript 1, plus lowercase B superscript asterisk subscript 2 times lowercase X subscript 2, plus lowercase B superscript asterisk subscript 3 times lowercase X subscript 3, plus lowercase B superscript asterisk subscript 12 times lowercase X subscript 1 times lowercase X subscript 2, plus lowercase B superscript asterisk subscript 13 times lowercase X subscript 1 times lowercase X subscript 3, plus lowercase B superscript asterisk subscript 23 times lowercase X subscript 2 times lowercase X subscript 3, plus lowercase B superscript asterisk subscript 11 times lowercase X squared subscript 1, plus lowercase B superscript asterisk subscript 22 times lowercase X squared subscript 2, plus lowercase B superscript asterisk subscript 33 times lowercase X squared subscript 3, plus lowercase E.

can be reparameterized as:

Equation 4:
lowercase y equals lowercase B subscript 0 lowercase X subscript 1 plus lowercase B subscript 2 lowercase X subscript 2 plus lowercase B subscript 2 lowercase X subscript 2 plus lowercase B subscript 3 lowercase X subscript 3 plus lowercase B subscript 12 lowercase X subscript 1 lowercase X subscript 3 plus lowercase B subscript 13 lowercase X subscript 1 lowercase X subscript 3 plus lowercase B subscript 23 lowercase X subscript 2 lowercase X subscript 3 plus lowercase E

using lowercase X to the power of 2 subscript 1 equals lowercase X subscript 1 times 1 minus lowercase X subscript 2 minus lowercase X subscript 3,lowercase X to the power of 2 subscript 2 equals lowercase X subscript 2 times 1 minus lowercase X subscript 1 minus lowercase X subscript 3, and lowercase X to the power of 2 subscript 3 equals lowercase X subscript 3 times 1 minus lowercase X subscript 1 minus lowercase X subscript 2.

Since feasible concrete mixtures do not exist over the entire region shown in figures 1 and 2, a subregion of the full simplex containing the range of feasible mixtures must be defined by constraining the component proportions. An example of a possible subregion for the three-component example is shown in figure 3. It is defined by the following volume fraction constraints (where x1 = water, x2 = cement, x3 = aggregate):

0.15 £ x1 £ 0.25

0.10 £ x2 £ 0.20

0.60 £ x3 £ 0.70

In this case the simplex designs are usually no longer appropriate and other designs are used [3].

Figure 3. Example of subregion of full simplex containing range of feasible mixtures. Diagram. This figure shows a highlighted subregion of a full triangular simplex. The subregion is defined by constraints on each X subscript any given integer, for example, 0.15 is less than X subscript 1 is less than 0.25. The subregion is the area that meets all of the constraints.

Figure 3. Example of subregion of full simplex containing range of feasible mixtures.

The advantage of the mixture approach is that the experimental region of interest is defined more naturally; however, analysis of the results can be complicated, especially if the number of components is greater than three.

2.2.2 Factorial (MIV) Approach

In the factorial approach, the q components of a mixture are reduced to q-1 independent variables using the ratio of two components as an independent variable. In the case of concrete, w/c is a natural choice for this ratio variable. For the situation with small q minus 1 independent variables, a 2small q minus 1 factorial design forms the backbone of the experiment. This design consists of several factors (variables) set at two different levels. As a simple example, consider a concrete mixture composed of four components: water, cement, fine aggregate, and coarse aggregate. Three independent factors, or variables, x, that can be selected to describe this system are x1= w/c (by mass), x2 = fine aggregate volume fraction, and x3 = coarse aggregate volume fraction. Reasonable ranges for these variables might be:

0.40 £ x1 £0.50

0.25 £ x2 £0.30

0.40 £ x3 £0.45

The levels for this example would be 0.40 and 0.50 for x1, 0.25 and 0.30 for x2, and 0.40 and 0.45 for x3. To simplify calculations and analysis, the actual variable ranges are usually transformed to dimensionless coded variables with a range of ±1. In this example, the actual range of 0.40 £ x£ 0.50 would translate to a coded range of -1 £ x1 £1. Intermediate values of x1 would translate similarly (e.g., the actual value of 0.45 would translate to a coded value of zero). The general equation used to translate from coded to uncoded is as follows:

Equation 5:
Equation 5. Lowercase X subscript actual equals lowercase X subscript min plus the quotient of the sum of lowercase X subscript coded plus 1, divided by 2, times the result of lowercase X subscript max minus lowercase X subscript min.

where xactual is the uncoded value, xmin and xmax are the uncoded minimum and maximum values (corresponding to -1 and +1 coded values), and xcoded is the coded value to be translated.

Suppose that the specifications for this mixture require a slump of 75 to 150 mm and a 28-day strength of 30 MPa. These specified properties are called the responses, or dependent variables, yi, which are the performance criteria for optimizing the mixture. For concrete, the responses may be any measurable plastic or hardened properties of the mixture. Cost may also be a response.

As with the mixture approach, empirical models are fit to the data, and polynomial models (linear or quadratic) typically are used. Equation 6 illustrates the general case of the full quadratic model for k =3 independent variables:

Equation 6:
Equation 6. Lowercase Y equals lowercase B subscript 0, plus lowercase B subscript 1 times lowercase X subscript 1, plus lowercase B subscript 2 times lowercase X subscript 2, plus lowercase B subscript 3 times lowercase X subscript 3, plus lowercase B subscript 12 times lowercase X subscript 1 times lowercase X subscript 2, plus lowercase B subscript 13 times lowercase X subscript 1 times lowercase X subscript 3, plus lowercase B subscript 23 times lowercase X subscript 2 times lowercase X subscript 3, plus lowercase B subscript 11 times lowercase X squared subscript 1, plus lowercase B subscript 22 times lowercase X squared subscript 2, plus lowercase B subscript 33 times lowercase X squared subscript 3, plus lowercase E.

In equation 6, the ten coefficients are represented by the bk and e is a random error term representing the combined effects of variables not included in the model. The interaction terms (xixj) and the quadratic terms (xi2) account for curvature in the response surface.

The central composite design(CCD), an augmented factorial design, is commonly used in product optimization. A complete CCD experiment design allows estimation of a full quadratic model for each response. A schematic layout of a CCD for k = 3 independent variables is shown in figure 4. The design consists of 2k (in this case, 8) factorial points (filled circles in figure 4) representing all combinations of coded values xk = ±1, 2*k (in this case, 6) axial points (hollow circles in figure 4) at a distance ±a from the origin, and at least 3 center points (hatched circle in figure 4) with coded values of zero for each xk. The value of a usually is chosen to make the

design rotatable[1], but there are sometimes valid reasons to select other values [3].

Figure 4. Schematic of a central composite design for three factors. Diagram. This figure shows a schematic of a central composite design for three factors. The framework for the design is a cube representing all combinations of coded values X subscript any given integer equals plus or minus 1, along with a center point (coded value for all X subscript any given integer equals 0), and axial points at distances of plus or minus alpha in each direction from the center point (0,0,0).

Figure 4: Schematic of a composite design for three factors

In the absence of curvature, a model with only linear terms would be sufficient, and the factorial portion of the CCD is a valid design by itself in that case. However, the presence or absence of significant curvature often is not known with certainty at the start. An advantage of the CCD over the mixture approach is that the CCD can be run sequentially in two blocks. The first block would consist of the factorial points (all combinations of xi = ±1) and some center points (at least 3), and the second block would consist of the axial points (points along each axis at distance a from the origin) and additional center points (at least 2). This approach allows analysis of the factorial portion before the axial portion is run. If curvature is insignificant based on the factorial portion, the additional runs are not necessary.

As shown in table 1, the number of coefficients in the quadratic model increases with k, and the number of trial batches required using a CCD begins to increase significantly for k> 5.

Table 1. Number of runs required for CCD experiment for k = 2 to 5 factors

k Factorial Axial Center* Total
244513
386519
4168529
516**
16**
10531
*assumes 3 CP for factorial portion and 2 CP for axial portion
**for k=5, a half-fraction of the full factorial is sufficient to estimate all linear terms and 2-factor interactions without confounding. Thus, 25-1 or 16 factorial points are usually sufficient

Therefore, using a CCD to optimize a concrete mixture of more than six components may be uneconomical. In such cases, one could identify the most important factors and limit them to five or fewer. For example, if the cementitious materials and chemical admixtures were the most important components, they would be varied, while the amounts of coarse and fine aggregate would be held constant.

Laboratory experiments were conducted using the mixture and factorial approaches to see if one was more appropriate for concrete mixture optimization. The experiments are described in chapters 3 and 4. The adaptability of each method for use as an interactive, Web-based program was also considered in developing the COST software, described in chapter 5.

2.3 Model Fitting and Validation

The polynomial models described in sections 2.2.1 and 2.2.2 are fit to data using analysis of variance (ANOVA) and least squares techniques [9]. Many statistical software packages have the capability to perform these analyses and fits. Once a model has been fit, it is important to verify the adequacy of the chosen model quantitatively and graphically.

Although the models (polynomials) are slightly different for the classical mixture approach and the factorial approach, many of the steps involved in model selection and fitting are the same. The first step in each case is to perform ANOVA to select the appropriate type of model (linear, quadratic, etc.). Sequential F-tests are performed, starting with a linear model and adding terms (quadratic, and higher if appropriate). Under the "Source" column of the ANOVA table, the line labeled "Linear" indicates the significance of adding linear terms, and the line labeled "Quadratic" indicates the significance of adding quadratic terms. The column labeled "DF" shows the degrees of freedom for each source. The F-statistic is calculated for each type of model, and the highest order model with significant terms normally would be chosen. Significance is judged by determining if the probability that the F-statistic calculated from the data exceeds a theoretical value. The probability decreases as the value of the F-statistic increases. If this probability is less than 0.05 (typically, although other levels of significance could be used), the terms are significant and their inclusion improves the model. An example of an ANOVA table for sequential model sum of squares is shown in table 2.

Table 2. Example of ANOVA sequential model sum of squares

SourceSum of Squares DFMean SquareF ValueProb > F
Mean 329.271329.27--
Linear 96.94519.3945.64< 0.0001
Quadratic 6.3615 0.421.000.5017
Special Cubic (aliased) 3.3570.481.260.3727
Cubic (aliased) 0.000---
Residual 3.038 0.38--
Total 438.9536 12.19--

In this example, the linear model is the highest order model with significant terms (Prob > F is less than 0.05); therefore, it would be the recommended model for this data. Typically, the selected model will be the highest order polynomial where additional terms are significant and the model is not aliased.

Once the type of model (e.g., linear, quadratic, etc.) is selected, the second step is to perform a lack-of-fit test, also using ANOVA, to compare the residual error to the pure error from replication. Table 3 is an example of ANOVA for lack of fit. If residual error significantly exceeds pure error, the model will show significant lack of fit, and another model may be more appropriate.

The desired result in a lack-of-fit test is that the model selected in step 1 will not show significant lack of fit (i.e., the F test will be insignificant). If the "Prob > F" value is less than the desired significance level (often .05), this indicates significant lack of fit.

Several summary statistics can be calculated for a model and used to compare models or verify model adequacy. These statistics include root mean square error (RMSE), adjusted r2, predicted r2, and prediction error sum of squares (PRESS). The RMSE is the square root of the mean square error, and is considered to be the standard deviation associated with experimental error. The adjusted r2 is a measure of the amount of variation about the mean explained by the model, adjusted for the number of parameters in the model [2].

Table 3. Example of ANOVA lack-of-fit test

Source Sum of Squares DF Mean Square F Value Prob > F
Linear6271.9322285.091.170.4335
Quadratic2164.867309.271.270.3703
Special Cubic (aliased)0.000---
Cubic (aliased)0.000---
Pure Error1950.918243.86--

The predicted r2 measures the amount of variation in new data explained by the model. PRESS measures how well the model fits each point in the design. To calculate PRESS, the model is used to estimate each point using all of the design points except the one being estimated. PRESS is the sum of the squared differences between the estimated values and the actual values over all the points. A good model will have a low RMSE, a large predicted r2, and a low PRESS.

After a model is selected, standard linear regression techniques (least squares) are used to fit the model to the data. ANOVA is performed and an overall F-test and lack-of-fit test confirm the applicability of the model. Summary statistics (r2, adjusted r2, PRESS, etc.) and the standard error for each model coefficient also are calculated.

For the factorial approach, an iterative model fitting process was used in this research. First, a full quadratic (or linear, if applicable) model is assumed, and significance tests (t-tests) are performed on each model coefficient. Insignificant terms are removed and the fitting process is repeated using a partial quadratic (or linear, if applicable) model. The significance tests are repeated and insignificant terms, if any, are removed. The process is repeated until there are no insignificant terms. At this point, if the model contains two-factor interaction or quadratic terms, it is checked for hierarchy. Hierarchical terms are linear terms that may be insignificant by themselves but are part of significant higher order terms. For example, x1and x3 are hierarchical terms of x1x3, a two-factor interaction term. If x1x3 is a significant term in the model, x1 and x3 are usually included in the model to maintain hierarchy. A hierarchical model allows for conversion of models between different sets of units (for a model involving temperature, conversion from F to C, for example).

Table 4 shows an ANOVA table for a selected model from a factorial experiment. Using the iterative approach, a reduced quadratic model was fit to the data. Note that terms B, C, and E are not significant ("Prob > F" >.05) but were added back into the model to make it hierarchical.

Table 4. Example of ANOVA model fitting for 1-day strength

Source Sum of Squares DF Mean Square F Value Prob > F
Model240.87830.1127.76< 0.0001
A213.261213.26196.66< 0.0001
B0.4810.480.450.5113
C0.0410.040.040.8433
E2.0612.061.900.1819
A26.2016.205.720.0257
AC5.1515.154.750.0404
AE7.1617.166.600.0175
BC6.5116.516.000.0227
Residual23.86221.08--
Lack of fit19.08181.060.890.6248
Pure error4.7841.19--
Corr. total264.7230     

Once the model fitting is performed, the next step is residual analysis to validate the assumptions used in the ANOVA. This analysis includes calculating case statistics to identify outliers and examining diagnostic plots such as normal probability plots and residual plots. If these analyses are satisfactory, the model is considered adequate, and response surface (contour) plots can be generated. Contour plots can be used for interpretation and optimization.

2.4 Optimization

When appropriate models have been established, several responses can be optimized simultaneously. Optimization may be performed using mathematical (numerical) or graphical (contour plot) approaches. Generally, graphical optimization is limited to cases in which there are only a few responses.

Numerical optimization requires defining an objective function (called a desirability or score function) that reflects the levels of each response in terms of minimum (zero) to maximum (one) desirability. One approach uses the geometric mean of the desirability functions for each individual response, where n is the number of responses to be optimized [10]:

Equation 7:
Equation 7. D equals the series lowercase D subscript 1 times lowercase D subscript 2 continuing through lowercase D subscript N, where lowercase N is the number of responses to be optimized, all to the 1 over lowercase N power.

Another approach is to use a weighted average of desirability functions:

Equation 8:
Equation 8. D equals the sum of lowercase W subscript lowercase 1 times lowercase D subscript lowercase 1, plus W subscript lowercase 2 times lowercase D subscript lowercase 2, continuing through lowercase W subscript lowercase N times lowercase D subscript lowercase N, where lowercase N is the number of responses, all divided by lowercase N

where n is the number of responses and wi are weighting functions ranging from 0 to 1.

Several types of desirability functions can be defined. Common types of desirability functions are shown in figure 5.

Figure 5. Examples of desirability functions. Drawings. This figure shows common types of desirability functions, including minimum, maximum, target, linear decreasing, linear increasing, and within range. "Minimum" has desirability of zero for values less than a selected value and one for values above it. "Maximum" has desirability of zero for values greater than a selected value, and one for values below it. "Target" has desirability of 1 at a designated target value within a selected range, with desirability decreasing linearly to zero at each limit of the range. "Linear (increasing)" has desirability of zero at the lower limit and one at the upper limit. "Linear (decreasing)" has desirability of one at the lower limit and zero at the upper limit. "Within range" has desirability of one within a designated range of values, and zero outside of that range.

Figure 5. Examples of desirability functions

These functions can also be expressed mathematically as well. For example, a linear desirability function where minimum is best would be expressed as:

Equation 9:
Equation 9. Lowercase D subscript lowercase I equals the quotient of the result of B subscript lowercase I minus Y subscript lowercase I, divided by the result of B subscript lowercase I minus A subscript lowercase I, all raised to the power of lowercase W subscript lowercase I.

for the range Ai Equal to or less than Yi Equal to or less than Bi and with wi = 1.

Once the desirability functions (and weights, if used) are defined for each response, optimization may proceed.

As an alternative to rigorous numerical methods, desirability can be evaluated by superimposing a grid of points at equal spacing over the experimental region and evaluating desirability at each point. The point(s) of maximum desirability can be found by sorting the results or by creating contour plots of desirability over the grid area.

 


1 If a design is rotatable, predicted values should have equal variances at locations equidistant from the origin.

2 The adjusted r2 differs from the "standard" r2, which is not adjusted for the number of parameters. The standard r2 can be made larger by adding more parameters to the model. This does not necessarily mean the model with more parameters is a better model.

 

Previous | Table of Contents | Next

Federal Highway Administration | 1200 New Jersey Avenue, SE | Washington, DC 20590 | 202-366-4000
Turner-Fairbank Highway Research Center | 6300 Georgetown Pike | McLean, VA | 22101