
4.4.4 Fit, Diagnose Model and Center Points

Introduction

This article explains how to fit regression and designed-experiment models, diagnose their adequacy, and use center points in designed experiments. The focus is practical: how to build and verify a good model of the process response and how to use center points to detect curvature and improve the model.

---

Fitting the Model

Choosing the Model Form

In improvement projects, models are usually based on:
- Simple linear regression – one predictor, straight-line relationship.
- Multiple linear regression – several predictors, straight-line approximation.
- Designed experiment (DOE) models – main effects and interaction terms from factorial or fractional factorial designs.
- Quadratic models – linear terms plus squares and sometimes interactions, often arising from center points or response surface designs.

The model form must reflect:
- The experimental design (what factors and levels were run).
- Prior process understanding (suspected curvature or interactions).
- The goal: prediction, explanation, or factor screening.

Fitting Regression and DOE Models

A typical model for k predictors has the general form:

Y = β₀ + β₁X₁ + β₂X₂ + … + βₖXₖ + error

For DOE with two-level factors:
- Coded units: −1 for low, +1 for high, 0 for center (if used).
- Main effects: each factor alone.
- Interactions: product terms (e.g., X₁X₂).

Steps to fit:
- Define the response Y and the predictors X.
- Code the factors if using DOE (−1, +1, and possibly 0).
- Fit the model using least squares.
- Examine the initial full model, including relevant main effects and interactions.

Selecting Terms: Main Effects and Interactions

In factorial or regression modeling:
- Main effects: the impact of each factor individually.
- Two-factor interactions: the joint effect when two factors change together.
- Higher-order interactions: three or more factors together; usually smaller and often omitted unless strongly indicated.
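The fitting steps above can be sketched in a few lines. This is a minimal illustration, not part of the original text: the 2² design and responses are made up, NumPy is assumed, and the model is fit in coded units (where each coefficient is half the classical low-to-high effect).

```python
import numpy as np

# Hypothetical 2^2 factorial in coded units (-1 = low, +1 = high)
X1 = np.array([-1.0, 1.0, -1.0, 1.0])
X2 = np.array([-1.0, -1.0, 1.0, 1.0])
y  = np.array([52.0, 60.0, 54.0, 68.0])   # illustrative responses

# Model matrix: intercept, main effects, and the X1*X2 interaction
X = np.column_stack([np.ones_like(X1), X1, X2, X1 * X2])

# Least-squares fit: Y = b0 + b1*X1 + b2*X2 + b12*X1*X2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b12 = beta   # b1 = 5.5 here means an 11-unit low-to-high effect of X1
```

Because the ±1 columns are orthogonal, each coefficient can be read off independently; the same call generalizes to more factors and to regression on uncoded predictors.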
Term selection typically uses:
- Practical knowledge – which interactions are plausible in the process.
- Statistical evidence – p-values, standardized effect plots, Pareto charts of effects.
- Design resolution – which interactions can be estimated separately from main effects (in fractional factorials).

A useful strategy:
- Start with a hierarchical model:
  - Include significant interactions.
  - Keep the corresponding main effects even if not individually significant.
- Remove clearly non-significant higher-order terms while maintaining hierarchy.

---

Model Adequacy and Goodness of Fit

Key Model Statistics

After fitting, check indicators of fit:
- R² (coefficient of determination)
  - Proportion of variation in Y explained by the model.
  - Higher is better, but a very high R² alone does not guarantee a good model.
- Adjusted R²
  - Adjusts for the number of predictors.
  - Useful when comparing models with different numbers of terms.
  - Helps avoid overfitting.
- Predicted R²
  - Based on cross-validation or prediction errors.
  - Assesses how well the model predicts new data.
  - A large gap between adjusted R² and predicted R² suggests overfitting or problems in the model.
- Standard error of the estimate (s)
  - Typical size of residuals.
  - Smaller s indicates a tighter fit around the regression line.

ANOVA and F-test

Analysis of variance (ANOVA) summarizes model fit:
- Mean square for regression (MSR) – variation explained by the model.
- Mean square error (MSE) – residual variation not explained by the model.
- F-statistic = MSR / MSE
  - Tests the null hypothesis that all model coefficients (except the intercept) are zero.

Interpretation:
- A large F and small p-value for the overall model:
  - Evidence that at least one predictor is significantly related to the response.
- p-values for individual terms:
  - Indicate which factors or interactions are statistically significant.
  - Use in combination with practical importance and design structure.
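These fit statistics follow directly from the ANOVA decomposition SST = SSR + SSE. As a hedged sketch with made-up data (NumPy assumed), the quantities above can be computed by hand:

```python
import numpy as np

# Hypothetical replicated 2^2 design (8 runs, coded units)
X1 = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
X2 = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
y  = np.array([50.1, 55.9, 53.2, 60.8, 49.7, 56.3, 52.8, 61.2])

# Fit a main-effects model by least squares
X = np.column_stack([np.ones_like(X1), X1, X2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

n, p = X.shape                      # p counts the intercept
sst = np.sum((y - y.mean()) ** 2)   # total sum of squares
sse = np.sum(resid ** 2)            # residual (error) sum of squares
ssr = sst - sse                     # regression sum of squares

r2     = 1 - sse / sst                            # R-squared
adj_r2 = 1 - (sse / (n - p)) / (sst / (n - 1))    # adjusted R-squared
msr    = ssr / (p - 1)                            # mean square for regression
mse    = sse / (n - p)                            # mean square error
f_stat = msr / mse                                # overall F = MSR / MSE
```

The p-value for `f_stat` would come from an F distribution with (p − 1, n − p) degrees of freedom (e.g., via `scipy.stats.f.sf`). Note that adjusted R² is never larger than R², reflecting the penalty for extra terms.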
Lack-of-Fit Test

The lack-of-fit test checks whether the chosen model form is adequate:
- It requires replicated runs at the same factor settings (often at center points or other design points).
- The residual error is partitioned into:
  - Pure error – variation between replicates at identical settings.
  - Lack-of-fit – leftover variation when the model does not match the true mean response.

Key ideas:
- If lack-of-fit is statistically significant (small p-value):
  - The model form is missing important structure (e.g., curvature, interactions, nonlinearity).
  - Reconsider adding terms such as quadratic terms or omitted interactions.
- If lack-of-fit is not significant:
  - There is no strong evidence that the model form is wrong, given the data.
  - Still verify assumptions and practical reasonableness.

---

Model Diagnostics

Residual Analysis

Residuals are the differences between observed and fitted values:

Residual = Observed Y − Predicted Y

They provide a way to test the assumptions of linear modeling:
- Linearity – correct functional form.
- Constant variance (homoscedasticity) – similar spread across predicted values and factors.
- Independence – errors are not correlated.
- Normality – residuals roughly normal (primarily important for inference).

Common residual plots:
- Residuals vs fitted values
  - Random scatter: supports linearity and constant variance.
  - Funnel shape: suggests non-constant variance (may need a transformation).
  - Curved pattern: suggests nonlinearity or missing terms (e.g., squared terms).
- Residuals vs each predictor
  - Helps detect patterns tied to specific factors.
  - Non-random patterns: consider additional terms (e.g., interactions, nonlinear terms).
- Normal probability plot (or histogram) of residuals
  - Approximately straight line: supports normality.
  - Strong curvature or heavy tails: consider a transformation or robust methods.
- Residuals vs order (run sequence)
  - Detects time trends or drifts in process conditions.
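The pure-error / lack-of-fit partition described above can be sketched numerically. This is an illustration only: the data are hypothetical (a 2² factorial with three replicated center runs), NumPy is assumed, and a first-order model is deliberately fit so the leftover structure shows up in the lack-of-fit term.

```python
import numpy as np

# Hypothetical 2^2 factorial (coded units) plus 3 replicated center points
x1 = np.array([-1, 1, -1, 1, 0, 0, 0], dtype=float)
x2 = np.array([-1, -1, 1, 1, 0, 0, 0], dtype=float)
y  = np.array([50.0, 56.0, 53.0, 61.0, 57.5, 58.1, 57.9])

# Fit a first-order (main-effects) model
X = np.column_stack([np.ones_like(x1), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = np.sum((y - X @ beta) ** 2)          # total residual sum of squares

# Pure error: variation of replicates about their own group means
settings = list(zip(x1, x2))
ss_pe, df_pe = 0.0, 0
for s in set(settings):
    idx = [i for i, t in enumerate(settings) if t == s]
    if len(idx) > 1:                       # only replicated settings contribute
        grp = y[idx]
        ss_pe += np.sum((grp - grp.mean()) ** 2)
        df_pe += len(idx) - 1

# Lack of fit: residual variation left after removing pure error
ss_lof = sse - ss_pe
df_lof = (len(y) - X.shape[1]) - df_pe
f_lof = (ss_lof / df_lof) / (ss_pe / df_pe)   # compare to F(df_lof, df_pe)
```

A large `f_lof` relative to the F(df_lof, df_pe) reference distribution signals that the first-order model misses real structure, here the curvature built into the center-point responses.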
Influential Points and Leverage

Not all points have equal impact on the model:
- Leverage – a numeric measure of how far a point’s predictor values are from the center of the design.
  - High-leverage points can strongly influence the fitted line or surface.
- Influence – a combination of leverage and residual size.
  - Common measures: Cook’s distance and DFFITS (names only; interpretation depends on software cutoffs).

Practical actions:
- Investigate influential points:
  - Check for data errors, special causes, or unusual conditions.
- Do not automatically delete them:
  - Decide based on process knowledge and data validity.
- If a valid influential point reveals important behavior:
  - Consider redesigning the experiment to better cover that region.

Transformations and Model Refinement

If diagnostics show problems, possible fixes include:
- Transform the response:
  - Log, square root, or Box–Cox transformation to stabilize variance or improve normality.
- Add missing terms:
  - Interactions or quadratic terms to capture curvature.
- Simplify the model:
  - Remove non-essential terms to reduce overfitting, especially if predicted R² is poor.

Always refit and recheck residuals after any change.

---

Center Points in Designed Experiments

Purpose of Center Points

Center points are runs at the mid-level of all quantitative factors in a factorial design:
- For each factor:
  - Low = coded −1
  - High = coded +1
  - Center = coded 0 (the midpoint between low and high)

They serve key purposes:
- Detect curvature – reveal departures from linearity between low and high settings.
- Estimate pure error – provide replicate data at a consistent location in the design space.
- Improve robustness – often represent a practically reasonable operating region.

Center points are meaningful for quantitative factors with ordered levels. They are not applicable for categorical-only factors.

Detecting Curvature with Center Points

In a two-level factorial design, main effects are assumed linear between low and high.
If the true response curve is curved, the straight-line model is inadequate. Center points help by comparing:
- The average response at the factorial points (all combinations of low and high).
- The average response at the center points.

Logic:
- If the true relationship is linear:
  - The mean at the center equals the mean of predictions from the linear model at the center.
- If there is curvature:
  - The center-point mean differs systematically from the linear prediction.

Statistical test:
- Compare the average response at the center points to the average of the factorial vertices.
- A significant difference indicates overall curvature in at least one factor.

Interpretation:
- Significant curvature:
  - Add quadratic terms (e.g., X₁²) and possibly consider a response surface design.
- No significant curvature:
  - A first-order (linear) model may be adequate within the studied range.

Center Points and Pure Error

When center points are replicated:
- Replication at the same settings gives:
  - An unbiased estimate of pure error (random variation in the process).
- This pure error is crucial for:
  - Lack-of-fit tests – separating modeling error from random noise.
  - More accurate confidence intervals and significance tests on effects.

Guidelines:
- Use multiple center-point runs (not just one) to:
  - Average out random noise.
  - Ensure a stable estimate of pure error.

Planning and Using Center Points

When designing experiments:
- When to include center points:
  - When factors are quantitative and ranges are not extremely narrow.
  - When linearity across the studied range is uncertain.
  - When an estimate of pure error is needed for lack-of-fit tests.
- How many center points:
  - Enough to provide a reasonable estimate of pure error (often 3–6, scattered through the run order).
  - Balance against overall experiment size.
- Placement in the run order:
  - Randomly distributed or deliberately spread across the sequence (early, middle, late) to check stability over time.
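The curvature comparison described above (factorial-point mean versus center-point mean, judged against pure error) can be sketched as follows. The data are hypothetical, NumPy is assumed, and the statistic is compared to a t distribution with (number of center runs − 1) degrees of freedom, one common convention.

```python
import numpy as np

# Hypothetical 2^2 factorial with 4 replicated center runs
y_fact   = np.array([50.0, 56.0, 53.0, 61.0])   # corner (±1, ±1) responses
y_center = np.array([57.6, 58.0, 57.7, 57.9])   # center (0, 0) responses

nf, nc = len(y_fact), len(y_center)

# A first-order model predicts the factorial average at the center,
# so the curvature contrast is that average minus the observed center mean.
curvature = y_fact.mean() - y_center.mean()

# Pure-error variance estimated from the center-point replicates
s2 = y_center.var(ddof=1)

# t-style statistic for curvature; compare to t with nc - 1 df
se = np.sqrt(s2 * (1 / nf + 1 / nc))
t_curv = curvature / se
```

Here the center-point mean sits well above the factorial average relative to the replicate noise, so `t_curv` is large in magnitude and curvature would be declared significant, pointing toward quadratic terms.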
Practical uses after running:
- If curvature is detected:
  - Consider transitioning from screening (two-level factorial) to a response surface design that includes axial and center runs.
- If no curvature is detected:
  - Retain the simpler first-order model and focus on optimizing within the linear region.

---

Integrating Fitting, Diagnosis, and Center Points

Bringing these elements together:
- Fit:
  - Build an initial linear (or linear-plus-interaction) model using regression or DOE results.
  - Use coded variables when analyzing factorial designs.
- Use center points:
  - Assess curvature and estimate pure error.
  - Run a curvature test: if significant, enrich the model with quadratic terms or extend the design.
- Diagnose:
  - Check ANOVA, R² statistics, and the lack-of-fit test.
  - Examine residual diagnostics for assumption violations.
  - Investigate influential points and re-evaluate suspicious data.
- Refine:
  - Add or remove terms while maintaining model hierarchy.
  - Apply transformations when needed.
  - Reassess fit and diagnostics after each adjustment.

The goal is a model that:
- Accurately describes the response within the studied factor ranges.
- Satisfies statistical assumptions reasonably well.
- Provides reliable predictions and insights for process improvement and control.

---

Summary

A sound model arises from a disciplined cycle of fitting, diagnosing, and refining:
- Fit the model using regression or DOE methods with appropriate main effects and interactions.
- Assess adequacy through R² measures, ANOVA, and lack-of-fit tests.
- Diagnose assumptions with residuals, normality checks, and influence measures.
- Use center points to detect curvature and estimate pure error, enabling meaningful lack-of-fit testing.
- Refine the model by adding curvature terms, adjusting the design, or simplifying the structure until the model both fits the data and makes practical sense.
Mastering these steps ensures that conclusions about factor effects and process behavior rest on statistically valid, well-diagnosed models.

Practical Case: Fit, Diagnose Model and Center Points

A medical-device assembly plant was struggling with inconsistent adhesive bond strength on a plastic component used in insulin pens. Failures appeared randomly during final inspection, causing rework and line stoppages. The Black Belt led a designed experiment focusing on three controllable factors in the curing process: oven temperature, cure time, and airflow setting. The goal was to understand which settings drove bond strength and how stable the process was across the operating range.

The team ran the experiment and used “Fit” to build a regression model relating bond strength to the three factors and their interactions. The initial fit looked acceptable on summary statistics, but residual plots showed curvature and non-constant variance, suggesting the linear model was not capturing the true relationship.

To diagnose the model, the team plotted residuals versus fitted values and versus each factor. Clear patterns appeared at the low and high ends of temperature, indicating lack of fit in those regions. They decided that the tested levels might be too wide and that the model needed better information in the middle of the operating range.

They added center points to the design: runs at the mid-levels of temperature, time, and airflow. These center runs were replicated to check for pure process variability and to test for curvature. After collecting the new data, they refit the model, explicitly testing for curvature using the center-point information.

With center points included, the diagnostics showed a significant curvature effect for temperature but not for time or airflow. The updated model, now including a quadratic term for temperature, produced residual plots with no obvious patterns and constant variance, confirming an adequate fit.
Using the improved model, the team selected a narrower, mid-range temperature window and standard cure time and airflow settings that maximized bond strength while minimizing variability. Subsequent production data confirmed a marked reduction in bond-strength failures and fewer line interruptions, with no additional equipment investment.

Practice questions: Fit, Diagnose Model and Center Points

A DOE model for a chemical process includes three numeric factors and uses a second-order regression model. The engineer adds center points to the design. What is the primary statistical purpose of including center points in this context?
A. To estimate main effects more precisely
B. To detect curvature in the response surface
C. To reduce multicollinearity among factors
D. To increase the number of replicates at all corner points

Answer: B
Reason: Center points, located at the midpoint of all factor ranges, allow testing whether the mean response at the center differs from the average of the factorial points, thereby detecting curvature and justifying a quadratic model. The other options do not reflect the main statistical role of center points in a second-order model.

---

In fitting a multiple regression model for a DOE with three factors and center points, the residual plots show a fan-shaped pattern (increasing spread with fitted values). Which is the most appropriate Black Belt action?
A. Transform the response (e.g., log or Box–Cox) and refit the model
B. Remove all center points from the data and refit a linear model only
C. Add more factors to the model to increase R-squared
D. Ignore the pattern if the model p-values are all significant

Answer: A
Reason: A fan-shaped residual pattern indicates non-constant variance; applying an appropriate transformation and refitting addresses this violation of regression assumptions and improves model diagnostics. The other options ignore or worsen the violation, or change the model structure without addressing heteroscedasticity.

---

A Black Belt fits a linear model Y = β0 + β1X1 + β2X2 for a 2^2 factorial with 4 center points. The mean response at the 4 center points is 50. The predicted response at the center from the fitted linear model is 55. Assuming adequate precision, what does this result most strongly indicate?
A. The linear model is adequate across the design space
B. There is evidence of curvature and the model may need quadratic terms
C. There is strong multicollinearity between X1 and X2
D. The center points should be discarded as outliers

Answer: B
Reason: A substantial difference between the observed center-point mean and the linear-model prediction indicates lack of fit due to curvature, suggesting that quadratic terms should be considered. The other options misinterpret the discrepancy or suggest unjustified data removal.

---

During model diagnostics for a fitted regression model from a central composite design, the normal probability plot of residuals is approximately linear, but the residuals vs. fitted plot shows a systematic U-shaped pattern. What is the most appropriate interpretation?
A. Normality assumption is violated; use a nonparametric test
B. Linearity assumption is violated; the model form is missing terms
C. Independence assumption is violated; residuals are autocorrelated
D. Constant variance assumption is violated; residuals have unequal spread

Answer: B
Reason: A systematic U-shaped pattern in residuals vs. fitted values indicates lack of fit due to a missing functional form (e.g., needed interaction or higher-order terms), despite normal residuals. The other options describe assumption violations that do not correspond to a U-shaped residual pattern.

---

A Black Belt designs an experiment on three factors at two levels each and includes 6 center points. After fitting a second-order model, the lack-of-fit test is non-significant (p = 0.45), and the pure error estimate is based entirely on the center points. What is the best conclusion regarding model adequacy?
A. The second-order model is adequate within the studied region
B. The model is overfitted and must be reduced to a linear model
C. The model is inadequate because center points cannot estimate pure error
D. The model must be discarded; more blocks should be added first

Answer: A
Reason: A non-significant lack-of-fit test using center-point replicates as pure error indicates no statistical evidence of lack of fit for the chosen model within the design space, supporting its adequacy. The other options misunderstand the role of center points in estimating pure error or prescribe unjustified model changes.
