U.S. Department of Transportation
Federal Highway Administration
1200 New Jersey Avenue, SE
Washington, DC 20590
Federal Highway Administration Research and Technology
Coordinating, Developing, and Delivering Highway Transportation Innovations
|This report is an archived publication and may contain dated technical, contact, and link information|
Publication Number: FHWA-RD-03-060
This chapter describes the application of factorial experiment design to the problem of concrete mixture optimization. In a mixture, the total amount (mass or volume) of the product is fixed, and the settings of each of the q components are proportions. Because the total amount is constrained to sum to one, the component variables are not independent. However, the q components can be reduced to q-1 mathematically independent variables using the ratio of two components as an independent variable . In the case of concrete, w/c is a natural choice for this ratio variable. If this strategy is adopted, a factorial approach may be applied.
4.2 Selection of Materials, Proportions, and Constraints
The materials used in this experiment were identical to those used in the mixture experiment (chapter 3): Type I cement, tap water, #57 crushed limestone, natural sand, silica fume (slurry), and naphthalene-sulfonate-based HRWRA. The six components were reduced to five independent variables: x1 = w/c (by mass), x2 = fine aggregate volume fraction, x3 = coarse aggregate volume fraction, x4 = HRWRA volume fraction, x5 = silica fume volume fraction.
The volume fraction ranges for coarse aggregate, fine aggregate, HRWRA, and silica fume were the same as used in the mixture experiment. The range of w/c (by volume) was calculated from the volume fraction limits of the mixture experiment. The lower limit of w/c was 0.16 ÷ 0.15 = 1.067, and the upper limit was 0.185 ÷0.13 = 1.423. These limits were equated to coded limits of -1.5 to +1.5 to give the most similar experimental region to that of the mixture experiment. Table 9 shows the settings of each variable corresponding to coded values of -1 and +1.
Table 9. Variable settings corresponding to coded values (factorial experiment)
|Variable||ID||Low Setting (coded value = -1)||High Setting (coded value = 1)|
|w/c ratio||x1||1.1263 (@ 0.36 by mass)||1.3637 (@ 0.43 by mass)|
For the aggregates, HRWRA, and silica fume, volume fractions were converted to batch masses (batch volume for HRWRA) based on total batch size and material properties. The batch masses of water and cement were calculated by constraining the sum of volume fractions of all six components to sum to one. This gives two equations in two unknowns (the other equation being the w/c expressed volumetrically) that can be solved for volume fraction of water and volume fraction of cement. Entrapped air was ignored in these calculations, although in practice it will affect yield calculations and an overall adjustment to the mixture proportions may be necessary for proper yield.
The performance criteria for the mix were as follows: slump between 50 and 100 mm, 1-day strength greater than 22 MPa, 28-day strength greater than 51 MPa, and charge passed in the RCT less than 700 coulombs. These are the same performance criteria used for the mixture experiment, except that the 1-day and 28-day strengths were rounded to 22 and 51 MPa (from 22.06 and 51.02 MPa used in the mixture experiment)1.
4.3 Experiment Design Details
A central composite design (see chapter 2) was chosen for this experiment. The actual values for each variable (expressed in terms of volume fraction) corresponding to the coded levels ±1 are shown in table 10. A commercially available software package for experiment design and analysis was used to plan the experiment. Thirty-one batches were planned in two nearly orthogonal blocks. The first block consisted of a half-fraction of 16 factorial points2 and 3 center points, and the second block consisted of 10 axial points and 2 center points. A total of five center points was chosen to allow use of the center point mixes as statistical control mixes to assess week-to-week variation, if any, over the five weeks of mixing (in addition to the use of center points as replicates to estimate pure error). The run order within each block was randomized to reduce the effect of uncontrolled variables. Center point runs were placed first and last (based on the total number of runs) with the remainder equally spaced, resulting in three center points in the factorial block and two center points in the axial block. The mixture proportions for each batch are shown in table 10.
One batch (run 3) was repeated at the end of the experiment because of suspiciously low strength values. A center point batch was included with the repeat to check statistical control. The results of the repeated batch were used in subsequent analyses.
4.4 Specimen Fabrication and Testing
The materials used in this study included a portland cement conforming to ASTM specification C150-94 Type I/II, a #57 crushed limestone coarse aggregate meeting grading requirements of AASHTO M43, a natural sand fine aggregate, silica fume (in slurry form), a naphthalene-sulfonate based HRWRA (ASTM C494-98 Type F/G), and municipal tap water. Thirty-one batches of concrete, each approximately 0.04 m3 in volume, were prepared over a 6-week period. Each batch was prepared in accordance with mixing procedures set forth in ASTM C192-95.
Table 10. Mixture proportions (per m3) for factorial experiment
|Batch (run)||w/c||Fine Agg (kg)||Coarse Agg (kg)||HRWRA (L)||Silica Fume (kg)||Cement (kg)||Water (kg)|
Precautions were taken to compensate for mortar retained by the mixer, by "buttering" the mixer prior to preparing each batch. The concrete was mixed using a rotating-drum machine mixer with a 0.17 m3 mixing capacity.
Each batch included sufficient concrete for 2 slump tests, 1 unit weight and air content test (ASTM C138 and ASTM C231), and 10 100 mm x 200 mm cylinders. All cylinders were fabricated in accordance with ASTM C192, except that external vibration (a vibrating table) was utilized for consolidation in lieu of rodding when slump was less than 50 mm. Immediately following casting, the cylinders were covered with plastic and left in the molds at room temperature for 24 ± 8 hours, after which they were removed from the molds and moist cured at 23 ± 2 ºC until testing. Specimens were tested for compressive strength in accordance with ASTM C39 at the ages of 1 and 28 days. In most cases, three cylinders were tested at each age, however, where anomalies in either specimen condition or test results were evident, a fourth or fifth specimen may have been tested. Prior to compression testing, the ends of each cylinder were ground plane and parallel in accordance with ASTM C39 tolerances. Three of the remaining cylinders from each batch were used for the RCT testing. Testing was performed according to ASTM C1202, except that the 50 mm test specimens were cut from the middle of each cylinder instead of from the top. All RCTs were performed at an age of 42 days from casting.
4.5 Results and Analysis
The average values for slump, 1-day strength, 28-day strength, and charge passed (RCT) for each batch are shown in table 11, along with the estimated cost (dollars per cubic meter) of the concrete mixture. The costs were calculated from the mixture proportions for each batch, based on approximate component costs obtained from a local (mid-Atlantic) ready-mix concrete producer. Each response was analyzed individually by examining summary plots of the data (scatterplots and means plots), fitting a model using ANOVA and least-squares methods, validating the model by examining the residuals for trends and outliers, and interpreting the model graphically. A detailed discussion of the analysis procedure for 1-day strength is presented in the following 2 sections. The analyses for the other responses were performed in a similar manner. The models for the other responses are reported in section 4.5.4.
4.5.2 Exploratory Data Analysis for 1-Day Strength
One advantage of the factorial approach is the ability to perform an initial assessment of the data using graphical techniques such as raw data plots, means plots, scatterplots, and cube plots. These techniques are illustrated in figures 12-15 (for complete sets of plots for each response, see appendix B). A raw data plot of 1-day strength is shown in figure 12. This plot illustrates the variation in the response (1-day strength) over time, for each run. The control batch results, shown as hollow squares, give an indication of consistency over time. In this case, the control results are all about equal, indicating no time-related effects. Raw data plots are also helpful in identifying suspect data values, which may result from (for example) errors in data recording or data entry, equipment malfunction, or poor specimen fabrication.
Table 11. Test results and costs (factorial experiment)
Figure 12. Raw data plot for 1-day strength (factorial experiment)
Figure 13 is a scatterplot showing the effects of varying one factor (w/c) on the 1-day strength. In this case, as expected, the 1-day strength decreases with increasing w/c.
Figure 13. Scatterplot showing effect of w/c on 1-day strength (factorial experiment)
Figure 14 shows a means plot for 1-day strength. The means plot allows comparison of the effects of each factor. In this experiment, 1-day strength was clearly influenced by w/c. Other factors appear to have had little effect.
Figure 14. Means plot for 1-day strength (factorial experiment)
Finally, figure 15 shows a cube plot of 1-day strength with respect to three factors (w/c, fine aggregate, and coarse aggregate). A cube plot is a convenient means of assessing quantitative effects of three factors on a response.
Figure 15: Example of cube plot of factorial points
4.5.3 Model Fitting and Validation for 1-Day Strength
After assessing the data graphically, the second step in analysis is to estimate an appropriate model for each response. As with the mixture approach, ANOVA and least-squares regression techniques are used. The first step is to use ANOVA to determine the appropriate type of model (e.g., linear, quadratic). An ANOVA table for sequential model sum of squares, shown in table 12, suggests that both linear and quadratic terms are significant. A lack-of-fit ANOVA table (table 13) suggests that a quadratic model has insignificant lack-of-fit ("Prob > F" = 0.8339).
Table 12. Sequential model sum of squares for 1-day strength (factorial experiment)
|Source||Sum of Squares||DF||Mean Square||F Value||Prob > F|
Table 13. Lack-of-fit test for 1-day strength (factorial experiment)
|Source||Sum of Squares||DF||Mean Square||F Value||Prob > F|
When a central composite design is used, the full quadratic model can be estimated, but often some of the terms are not significant. The following procedure3 was used to identify an appropriate reduced quadratic model:
1) Fit the full quadratic model and for each coefficient, calculate the t-statistic for the null hypothesis that the coefficient is equal to zero.
2) Perform the regression again with a partial model containing only those terms that are statistically significant (i.e., those terms that had a t-statistic greater than that for the chosen significance level, in this case 0.05). Calculate the t-statistics again and drop any terms which are not significant.
3) Repeat step 2 until the partial model contains only significant terms.
4) Add first-order terms required to make the model hierarchical. Hierarchical polynomial models make sense under linear transformations such as changing units of temperature from Celsius to Fahrenheit . All second-order terms that appear in the model must have corresponding first-order terms included in order to make the model hierarchical.
The ANOVA for the final hierarchical model (with the hierarchical terms shaded) is shown in table 14.
Table 14. ANOVA for 1-day strength model (factorial experiment)
|Source||Sum of Squares||DF||Mean Square||F Value||Prob > F|
|Lack of fit||19.08||18||1.06||0.89||0.6248|
For one-day strength, y2, the fitted model was:
The adequacy of each fitted model was validated quantitatively by calculating statistical measures such as residual standard deviation and PRESS, and graphically by examining residual plots. The residual standard deviation, s, for this model is 0.99 MPa. A value of s near the repeatibility value (replicate standard deviation calculated from center points) is an indication of an adequately fitting model. For this experiment, the repeatibility value is 1.04 MPa, which is close to s. The PRESS statistic (see page 12) is a measure of how well the model fits each point in the design (the smaller the PRESS statistic, the better the fit). PRESS and some other quantitative indicators of model adequacy are shown in table 15.
Table 15. Summary statistics for 1-day strength model (factorial experiment)
The residuals are the deviations of the observed data values from the fitted values, xI , and are estimates of the error terms ei in the model. The ei are assumed to be random and normally distributed with mean equal to zero and constant standard deviation. A normal probability plot of the residuals (shown in figure 16) is used to assess the validity of this assumption. If the error terms follow a normal distribution, they will fall on a straight line on the normal probability plot. Because they are estimates of the error terms, the residuals should exhibit similar properties.
If the assumptions are valid, plots of the residuals versus run sequence, predicted values, and other independent variables should be random and structureless. If structure remains in the residuals, residual plots may suggest modifications to the model that will remove the structure.
Figure 16: Example of a normal probability plot for model validation
Figure 17 shows a plot of residuals versus run sequence for 1-day strength. The plot shows no significant structure to the residuals.
Figure 17. Example of a resident plot (residuals vs. run) for model validation
4.5.4 Models for Other Responses
Using the same procedure described above for 1-day strength, the following models were fit to slump (x1), 28-day strength (x3), and 42-day RCT (x4) results:
The objective of optimization may be to find the "best settings" (settings which maximize or minimize a particular response or responses) or to meet a set of specifications. In either case, optimization usually involves considering several responses simultaneously. The same optimization strategy that was used in the classical mixture approach can be used for a factorial approach (see chapter 3).
4.6.1 Graphical Optimization
For three or fewer responses, contour plots can be useful in identifying optimum settings. Individual contour plots can be used to identify best settings for each response. Figure 18 shows a contour plot of 28-day strength as a function of w/c and silica fume with HRWRA at its middle setting. Figure 19 shows the same plot but with HRWRA at the high setting. These plots illustrate that the highest strength is reached at the high level of HRWRA coupled with the low levels of w/c and silica fume.
Contour plots can also be overlaid with the constraints for each response defining a subarea of settings that meet the response criteria. For example, figure 20 shows all settings meeting the following criteria: RCT < 700 and slump equal 50 to 100 mm. The white area in the plot indicates the settings meeting the criteria. The gray area on the plot shows the region of settings that do not meet the constraints simultaneously. Figure 21 shows the same plot with the added constraint that 28-day strength > 51 MPa. As constraints are added, the feasible region usually gets smaller. In some situations, no settings will meet all of the criteria.
Figure 18. 28-day strength in w/c and silica fume (HRWRA) at middle setting)
Figure 19. 28-day strength in w/c and silica fume (HRWRA at high setting)
Figure 20. Overlay plot for RCT <700 and slump = 50-100 mm
Figure 21. Overlay plot for RCT < 700, slump = 50 - 100 mm, and 28-day strength > 51 MPa4.6.2 Numerical Optimization
For more than three responses considered simultaneously, numerical optimization is often more practical than graphical optimization. The numerical optimization technique using desirability functions described in chapter 2 and used in the classical mixture experiment (chapter 3) can also be applied to the factorial experiment. Desirability functions for this experiment are similar to those used in the mixture experiment, except that in some cases (e.g, cost) the endpoint values are different. The desirability functions are shown in figure 22.
Figure 22. Desirability functions for factorial experiment
Numerical optimization gave the following best settings for this concrete mixture: w/c = 0.367 (by mass), fine aggregate volume fraction = 0.285, coarse aggregate volume fraction = 0.408, HRWRA volume fraction = 0.0061, and silica fume volume fraction = 0.0153. The predicted response values and associated uncertainties (at a 95 percent confidence level) are slump = 74 ±19 mm, 1-day strength = 22.08 ±1.17 MPa, 28-day strength = 58.65 ±2.32 MPa, and RCT = 378 ±30 coulombs, at a cost of $100.68 per m3.
4.6.3 Accounting for Uncertainty
As described for the mixture approach, the fitted functions (models) and values predicted from them have uncertainty associated with them because they are estimated from data. For example, for the optimal mixture proportions shown above the predicted 1-day strength is 22.08 ±1.17 MPa. If these proportions are used, we can be 95 percent confident that the true 1-day strength will lie between 20.91 and 23.25 MPa. But because the specified 1-day strength is 22 MPa, it is quite possible that the true 1-day strength will fall below the specified value. Therefore, the constraints must be modified to account for the uncertainty in the fitted functions. The uncertainties for the optimal mixture can be used to define modified constraints, and a new set of optimal mixture proportions can be identified for these new constraints. The predicted responses based on the new optimal proportions must be checked to see that the original specifications are met.
The modified constraints are slump between 69 and 81 mm, 1-day strength greater than 23.17 MPa, 28-day strength greater than 53.32 MPa, and RCT less than 670 coulombs. The best mixture for this new set of constraints is w/c = 0.358 (by mass), fine aggregate volume fraction = 0.282, coarse aggregate volume fraction = 0.4071, HRWRA volume fraction = 0.0062, and silica fume volume fraction = 0.0153. The predicted response values and associated uncertainties (at a 95 percent confidence level) are slump = 74 ±20 mm, 1-day strength = 23.17 ±1.26 MPa, 28-day strength = 59.62 ±2.68 MPa, and RCT = 363 ±32 coulombs, at a cost of $101.65 per m3. All but one of the lower or upper bound values for the responses now meet the original specifications. The exception is the lower bound for 1-day strength, which is 21.91 MPa (compared with the specified value of 22 MPa). In practice, this small difference is probably insignificant; however, it may be worthwhile to investigate ways to increase 1-day strength for this mix. A slightly lower w/c (0.35, for example) would probably be a sufficient remedy. The predictive models estimated from the experiment can be used to predict responses for settings anywhere within the experimental design space (i.e., anywhere within the defined variable ranges). However, extrapolation beyond the design space is not recommended.
In the factorial experiment, accounting for uncertainty increased the cost of the optimal mix only slightly (from $100.68 to $101.65). As with the mixture experiment, the value of optimization is evident when the cost of the optimal mix ($101.65) is compared with the range of mixture costs in the experiment ($90.56 to $124.87).
1 The mixture experiment was originally performed using English units with 1-day strength requirement of 3200 psi and 28-day strength requirement of 7400 psi.
2 The 16 factorial points represent a half-fraction of the full factorial for 5 variables, which has 25 = 32 points. The 25-1 half-fraction can be used in this case because it is a Resolution V design that allows estimation of all linear coefficients and two-factor interactions without confounding.
3 This procedure may be used because the CCD experiment design is well-balanced and orthogonal (or nearly so).