4.3 Designed Experiments
Purpose and Core Ideas

Designed Experiments, or Design of Experiments (DOE), is a structured method for discovering how input factors affect outputs and for optimizing performance with the fewest trials possible. Key goals are to:

- Identify critical factors and interactions
- Quantify effects with statistical confidence
- Find optimal settings of controllable factors
- Build reliable prediction models for the process

DOE goes beyond changing one factor at a time. It deliberately varies several factors simultaneously using a planned design, then uses statistical analysis to separate real effects from noise.

---

DOE Foundation Concepts

Factors, Levels, and Responses

- Factor: A controllable input that can be set at different values (temperature, speed, pressure).
- Level: A chosen value of a factor (low vs high; 50°C vs 80°C).
- Response: The output measured (defect rate, yield, cycle time, strength).
- Noise factor: A source of variation not directly controlled, or too costly to control, during normal operation.

For basic designs:

- Two-level designs use low and high settings, often coded as -1 and +1.
- Responses can be continuous (e.g., time, weight) or binary (pass/fail).

Main Effects and Interactions

- Main effect: The impact of changing one factor from low to high, averaged over the levels of the other factors.
- Interaction: When the effect of one factor depends on the level of another factor.

Examples:

- No interaction: Increasing temperature always increases strength by about the same amount, regardless of pressure.
- Interaction: Increasing temperature helps at low pressure but hurts at high pressure.

In designed experiments:

- Interactions are often as important as main effects.
- Failing to model interactions can lead to wrong conclusions.

Experimental Error and Replication

Random variation (error) is always present.
DOE separates signal (real effects) from noise by:

- Replication: Running the same combination of factor levels more than once.
- Randomization: Running trials in random order to protect against time trends and hidden biases.
- Blocking: Grouping similar experimental units to isolate known nuisance variables.

These elements provide a basis for estimating experimental error, which is critical for statistical tests of significance.

---

Planning a Designed Experiment

Clarifying Objectives

A DOE should start with a clear, quantitative objective:

- Screen many factors to find the vital few
- Understand relationships among key factors and responses
- Optimize settings for best performance
- Demonstrate robustness to variation (tolerance and sensitivity)

Define:

- Primary response(s): The measure(s) to be improved.
- Secondary response(s): Measures to be monitored (e.g., cost, side effects).
- Constraints: Practical limits on factors, time, cost, or risk.

Selecting Factors and Ranges

Choose:

- Factors that are controllable and believed to influence the response.
- Levels that are:
  - Safe and practical to run
  - Wide enough to detect meaningful changes
  - Centered on current or feasible operating conditions

Check:

- Avoid including too many unimportant factors in a high-resolution optimization design.
- Consider separating screening (many factors, simple designs) from optimization (few factors, more detailed designs).

Experimental Units, Randomization, and Blocking

- Experimental unit: The entity that receives a treatment combination (part, batch, patient, machine setup).
- Randomize trial order to:
  - Break correlation between factor settings and time-related effects.
  - Reduce the risk that hidden trends bias the results.
- Use blocking when:
  - You must run the experiment in batches (shifts, days, lots).
  - Conditions change between blocks but are relatively stable within blocks.
A block is treated like a nuisance factor whose effect is removed from the error term, improving sensitivity to real factor effects.

---

Two-Level Factorial Designs

Full Factorial Designs

A full factorial design includes all combinations of factor levels. For k two-level factors:

- Number of runs = 2^k (ignoring replication and blocks).
- Example with 3 factors (A, B, C): 2^3 = 8 combinations.

Advantages:

- Estimates all main effects and all interactions up to order k.
- Clear interpretation (no aliasing).
- Good for small numbers of factors.

Disadvantages:

- Grows quickly with the number of factors.
- Can become impractical beyond about 5 or 6 factors.

2^k Designs: Structure and Coding

Two-level factorial designs use coded values:

- Low level: -1
- High level: +1

Each row of the design:

- Represents one run with specific factor settings.
- Often includes computed columns for interactions (e.g., AB, AC, BC).

In analysis:

- Effects are estimated using contrasts based on these coded levels.
- Coding simplifies interpretation: the main effect of A is the change in response when A goes from -1 to +1, averaged over the other factors.

---

Fractional Factorial Designs

Motivation for Fractional Designs

When many factors need screening, a full factorial may be too large. Fractional factorial designs use a carefully chosen subset (fraction) of the runs:

- Example: 2^(7-3) design
  - 7 factors
  - 2^(7-3) = 2^4 = 16 runs instead of 128

Advantages:

- Efficient for factor screening.
- Allow estimation of many main effects and some low-order interactions with fewer runs.

Trade-off:

- Some effects are aliased (confounded) with others, meaning they cannot be distinguished from each other based on the data alone.

Generators, Defining Relation, and Aliasing

Fractional factorials are built using generators:

- A generator defines a factor in terms of others.
- Example: For a 2^(4-1) design with 4 factors (A, B, C, D):
  - Generator: D = ABC
  - This creates a half fraction (16 full-factorial runs → 8 fractional runs).

The defining relation is obtained by:

- Multiplying both sides of each generator by the defined factor (any factor multiplied by itself equals the identity I).
- Example: D = ABC ⇒ I = ABCD, where I is the identity.

The defining relation produces the alias structure, which shows which effects are confounded.

Alias examples:

- From I = ABCD, multiplying both sides by A:
  - A = BCD
  - Interpretation: Main effect A is aliased with the three-way interaction BCD.

In practice:

- High-order interactions (3-way or higher) are often assumed negligible.
- Choose a design so that important effects (main effects and low-order interactions) are not aliased with each other.

---

Resolution of Fractional Factorial Designs

Design Resolution Definitions

Resolution characterizes how clearly a design separates different orders of effects.

- Resolution III:
  - Main effects may be aliased with two-factor interactions.
- Resolution IV:
  - Main effects are clear of two-factor interactions.
  - Two-factor interactions may be aliased with other two-factor interactions.
- Resolution V:
  - Main effects are clear of two- and three-factor interactions (aliased only with four-factor and higher interactions).
  - Two-factor interactions are clear of other two-factor interactions, but may be aliased with three-factor interactions.

Higher resolution is better for detecting interactions clearly, but often requires more runs.

Choosing an Appropriate Resolution

Guidelines:

- For screening many factors:
  - Resolution III or IV designs are common.
  - The priority is identifying main effects.
- For detailed understanding and modeling:
  - Resolution V or full factorial designs are preferred.
  - The priority is capturing key two-factor interactions.

Balance:

- Run size vs clarity.
- Use subject-matter knowledge to:
  - Decide which interactions are plausible and important.
  - Tolerate aliasing only between clearly unimportant effects.

---

Center Points and Curvature

Purpose of Center Points

Center points are runs at the mid-level of all quantitative factors:

- Used in 2-level designs to detect curvature in the response.
- Provide an estimate of pure error when replicated.

Center points help answer:

- Is the true relationship approximately linear in the chosen factor ranges?
- Are we operating near a minimum or maximum?

Detecting Curvature

Compare the average response at the center points with the average of the factorial (corner) points:

- If the center-point mean is significantly different from the mean of the factorial points, this indicates curvature.
- If curvature is significant, a simple two-level (linear) model is inadequate, and a more complex (e.g., quadratic) model or response surface design is needed.

---

Blocking in Designed Experiments

Why Block?

Blocking accounts for known nuisance variation not of primary interest, such as:

- Day-to-day differences
- Machine-to-machine differences
- Operator shifts

By blocking:

- Variation due to these nuisance sources is removed from the error term.
- Sensitivity to detect factor effects improves.

Implementing Blocks

In design:

- Blocks are included as an additional factor but treated differently in analysis.
- The blocks themselves are typically not randomized, but run order within each block should still be randomized.

In analysis:

- Block effects are estimated and removed from the residual error.
- The focus remains on the effects of controllable process factors.

---

Analysis of Variance (ANOVA) for DOE

Purpose of ANOVA

ANOVA assesses whether factor effects and interactions are statistically significant compared to random variation.

Key ideas:

- Partition the total variation in the response into components:
  - Variation due to each factor and interaction
  - Variation due to blocks (if used)
  - Residual or error variation
- Compare mean squares using F-tests to judge significance.
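The partition of variation described above can be made concrete with a small numerical sketch. The example below is illustrative (the data are hypothetical, not from the text): it analyzes one two-level factor with four replicates per level, computes the sums of squares by hand with NumPy, and obtains the p-value from the F distribution in SciPy.

```python
import numpy as np
from scipy import stats

# Hypothetical replicated responses for one two-level factor (illustrative data).
low = np.array([10.1, 9.8, 10.3, 10.0])    # factor at -1
high = np.array([12.0, 11.7, 12.4, 11.9])  # factor at +1
y = np.concatenate([low, high])

grand_mean = y.mean()

# Partition: SS_total = SS_factor + SS_error
ss_total = ((y - grand_mean) ** 2).sum()
ss_factor = (len(low) * (low.mean() - grand_mean) ** 2
             + len(high) * (high.mean() - grand_mean) ** 2)
ss_error = (((low - low.mean()) ** 2).sum()
            + ((high - high.mean()) ** 2).sum())

df_factor = 1          # two levels -> 1 degree of freedom
df_error = len(y) - 2  # n minus number of level means

ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_ratio = ms_factor / ms_error
p_value = stats.f.sf(f_ratio, df_factor, df_error)  # upper-tail probability

print(f"SS_factor={ss_factor:.3f}  SS_error={ss_error:.3f}  "
      f"F={f_ratio:.2f}  p={p_value:.4g}")
```

The same partition generalizes to factorial designs: each factor, interaction, and block contributes its own sum of squares and degrees of freedom, and each is tested against the error mean square.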
Key ANOVA Elements

- Sum of Squares (SS): Quantifies variation attributable to each source.
- Degrees of Freedom (df): Reflects the number of independent pieces of information.
- Mean Square (MS): SS divided by df.
- F-ratio: MS for a factor / MS for error.
- p-value: Probability of obtaining an F-ratio as large as observed if the true effect were zero.

Interpretation:

- Small p-value: The factor or interaction is statistically significant.
- Large p-value: No evidence of a real effect beyond random noise.

Assumptions:

- Residuals are independent.
- Residuals are approximately normally distributed.
- Residuals have constant variance across fitted values and factor levels.

---

Estimating Effects and Building Models

Effect Estimates

For two-level designs:

- The effect of a factor is the difference between the average response at the high level and the average response at the low level.
- An interaction effect is a difference in differences, representing how the effect of one factor changes over the levels of another.

Effect estimates are used to:

- Rank factors by their impact on the response.
- Identify patterns of interaction.

Regression Modeling

ANOVA and regression are closely related. A DOE model is often written as:

- Response = Intercept + (Main effects) + (Interactions) + Error

For example, with three factors A, B, C:

- Y = β0 + βA·A + βB·B + βC·C + βAB·AB + βAC·AC + βBC·BC + ε

Where:

- A, B, C are coded factor levels (-1, +1).
- AB, AC, BC are coded interaction terms (products of coded levels).
- The β's are coefficients estimated from the data.

Model use:

- Predict the response for any combination of factor levels within the studied range.
- Understand the direction and magnitude of factor effects.
- Support optimization and control decisions.

---

Checking Model Adequacy

Residual Diagnostics

Model adequacy is assessed by examining residuals (observed − fitted). Common checks:

- Normal probability plot of residuals:
  - Points should approximate a straight line.
- Residuals vs fitted values:
  - Look for constant spread; avoid funnel shapes.
  - Check for patterns suggesting nonlinearity or missing terms.
- Residuals vs run order:
  - Look for time trends, drifts, or cycles.

If diagnostics reveal problems:

- Consider adding missing terms (interactions, curvature).
- Transform the response if the variance is not constant.
- Re-examine the measurement system and experimental execution.

Transformation and Alternatives

When assumptions are violated:

- Apply transformations (e.g., log, square root) for nonconstant variance.
- For proportions or counts, use appropriate link functions and generalized linear models when available.

However, transformations and advanced models should not distract from the core DOE logic:

- Clear design
- Careful execution
- Thoughtful interpretation

---

Optimization and Confirmation

Using the Model for Optimization

With a fitted model:

- For a single response:
  - Identify factor settings that maximize or minimize the predicted response.
  - Use contour plots or response surface visualization when available.
- For multiple responses:
  - Explore trade-offs among responses.
  - Seek compromise settings that are acceptable on all key measures.

Optimization should:

- Respect practical constraints on factor levels.
- Avoid extrapolation far beyond the experimental region.

Confirmation Experiments

After selecting candidate optimal settings:

- Run confirmation experiments at those settings.
- Compare observed results vs model predictions.
- If the confirmation results agree (within expected error), the model is validated for use within the studied range.
- If not, re-examine assumptions, design, measurement, and execution; adjust the model or consider a new experiment if needed.

---

Response Surface Methods (Overview)

When to Move Beyond 2-Level Factorials

Two-level factorial and fractional factorial designs primarily model linear effects and interactions.
When:

- Curvature is detected via center points, or
- Fine-tuning around an optimum is needed

then more advanced designs are appropriate to estimate quadratic effects, such as:

- Central Composite Designs (CCD)
- Box–Behnken Designs

These include:

- Center points
- Axial or intermediate points
- Enough runs to estimate squared terms (e.g., A², B²) and interactions

Goals of Response Surface Methods

- Accurately describe curved response surfaces.
- Find true local optima (maxima, minima, saddle points).
- Support robust optimization within a continuous factor space.

In practice, a two-stage approach is common:

- First: screening via a fractional factorial.
- Second: local optimization via a response surface design on the key factors.

---

Practical Considerations in DOE

Robustness to Noise

Designed experiments can also explore robustness:

- Deliberately vary important noise factors (when possible).
- Look for factor settings where the response is relatively insensitive to noise.

This supports:

- Stable performance under real-world conditions.
- Less reliance on tight control of every input.

Common Pitfalls

Avoid:

- Factor ranges too narrow to show meaningful changes.
- Ignoring strong interactions because the design cannot estimate them.
- Over-interpreting non-significant effects.
- Violating randomization without appropriate blocking or modeling.

Pay attention to:

- Clear objectives.
- Practical feasibility of experimental runs.
- Discipline in data collection and run execution.

---

Summary

Designed Experiments provide a systematic way to:

- Plan efficient, structured tests of multiple factors and their interactions.
- Use two-level factorial and fractional factorial designs to screen and model.
- Manage trade-offs between run size, resolution, and aliasing.
- Apply ANOVA and regression to identify statistically significant effects.
- Check model adequacy with residual diagnostics.
- Use models for optimization, robustness, and confirmation of improved settings.
- Extend to response surface methods when curvature and fine-tuning are important.

When planned, executed, and analyzed correctly, Designed Experiments reveal how a process truly works and how to set it for high, predictable performance.
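The coded-design and effect-estimation mechanics summarized above can be sketched in a few lines of Python. The example below builds a 2³ full-factorial design matrix, forms interaction contrasts as products of the coded columns, and estimates every effect as (mean at +1) − (mean at -1); the response values are hypothetical, chosen only to illustrate the arithmetic.

```python
import itertools
import numpy as np

# Coded 2^3 full factorial: every combination of -1/+1 for A, B, C.
runs = np.array(list(itertools.product([-1, 1], repeat=3)))  # shape (8, 3)
A, B, C = runs[:, 0], runs[:, 1], runs[:, 2]

# Hypothetical responses for the 8 runs, in the same row order (illustrative only).
y = np.array([45.0, 52.0, 48.0, 55.0, 44.0, 60.0, 47.0, 63.0])

def effect(contrast, y):
    """Effect estimate: mean response at +1 minus mean response at -1."""
    return y[contrast == 1].mean() - y[contrast == -1].mean()

# Main effects use the factor columns; interactions use products of columns.
effects = {
    "A": effect(A, y),
    "B": effect(B, y),
    "C": effect(C, y),
    "AB": effect(A * B, y),
    "AC": effect(A * C, y),
    "BC": effect(B * C, y),
    "ABC": effect(A * B * C, y),
}
for name, e in effects.items():
    print(f"{name:>3}: {e:+.2f}")
```

With coded -1/+1 levels, the regression coefficient for each term in the model above is simply half the corresponding effect, which is why coding makes the two views interchangeable.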
Practical Case: Designed Experiments

A pharmaceutical packaging line is sealing blister packs for tablets. About 6% of packs fail the seal integrity test, causing rework and shipment delays. The process engineer suspects several factors may affect seal quality: sealing temperature, dwell time, and sealing pressure. Operators have different opinions about which matters most, and past one-factor-at-a-time trials gave inconsistent results.

The team designs a structured experiment with planned combinations of the three factors at two levels each. They randomize the run order, collect pass/fail data for each run, and keep all other conditions (foil type, tablet, machine) constant.

Analysis of the experiment shows:

- Temperature and dwell time both significantly affect seal integrity.
- Pressure has little effect within the tested range.
- There is a strong interaction between temperature and dwell time (high temperature with long dwell time overheats the foil and increases failures).

Based on the results, the team sets a slightly higher temperature with a shorter dwell time and leaves pressure at the current standard. After confirming with a few verification runs, seal failures drop from about 6% to under 1%, with no increase in material cost or cycle time.

End section
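The temperature × dwell-time interaction in this case can be expressed as a simple 2×2 contrast. The sketch below uses hypothetical failure rates (the case does not report run-level data) to show how such an interaction would be computed, and why the effect of raising temperature can reverse depending on dwell time.

```python
# Hypothetical 2x2 slice of the blister-seal case: failure rates in percent
# by temperature and dwell time (illustrative numbers, not real case data).
fail = {
    ("low_temp", "short_dwell"): 7.0,
    ("low_temp", "long_dwell"): 4.0,
    ("high_temp", "short_dwell"): 1.0,
    ("high_temp", "long_dwell"): 8.0,  # overheated foil: failures rise again
}

# Interaction contrast: [(+,+) + (-,-) - (+,-) - (-,+)] / 2
ab = (fail[("high_temp", "long_dwell")] + fail[("low_temp", "short_dwell")]
      - fail[("high_temp", "short_dwell")] - fail[("low_temp", "long_dwell")]) / 2

# Effect of raising temperature at each dwell level: opposite signs -> interaction.
temp_effect_short = fail[("high_temp", "short_dwell")] - fail[("low_temp", "short_dwell")]
temp_effect_long = fail[("high_temp", "long_dwell")] - fail[("low_temp", "long_dwell")]

print(f"interaction: {ab:+.1f} pct pts; "
      f"temp effect at short dwell: {temp_effect_short:+.1f}, "
      f"at long dwell: {temp_effect_long:+.1f}")
```

A large nonzero interaction contrast like this is exactly why a one-factor-at-a-time study of temperature alone gave the team inconsistent results.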
Practice question: Designed Experiments

A Black Belt is planning an experiment to study the effect of three factors (A, B, C), each at two levels, on a continuous response. The team wants to be able to estimate all main effects and two-factor interactions, but not three-factor interactions. Which design is most appropriate?

A. 2³ full factorial design
B. 2³⁻¹ fractional factorial design (Resolution III)
C. 2³⁻¹ fractional factorial design (Resolution IV)
D. Plackett-Burman design with 8 runs

Answer: A
Reason: With three factors, the full 2³ design requires only 8 runs and estimates all main effects and two-factor interactions without aliasing. A 2³⁻¹ half fraction has defining relation I = ABC and is at most Resolution III, so main effects are aliased with two-factor interactions; a Resolution IV half fraction does not exist for three factors, and a Plackett–Burman design likewise confounds main effects with interactions.

---

A Black Belt runs a 2² full factorial experiment with factors A and B at two levels, with one replicate per treatment combination. The average responses for the four combinations are: (A−,B−)=10, (A+,B−)=18, (A−,B+)=12, (A+,B+)=20. What is the estimate of the interaction effect AB?

A. 0
B. 2
C. 4
D. 8

Answer: A
Reason: AB effect = [(A+,B+) + (A−,B−) − (A+,B−) − (A−,B+)] / 2 = (20 + 10 − 18 − 12) / 2 = 0. The effect of A is +8 at both levels of B, so there is no interaction. The other options represent non-zero interaction magnitudes inconsistent with the contrast calculation from the cell means.

---

A Black Belt is reviewing an experimental result and observes that the normal probability plot of standardized effects from a 2⁴⁻¹ design shows three effects clearly off the straight line: A, C, and AC. All other effects lie close to the line. Which interpretation is most appropriate?

A. Only factor A is significant; ignore C and AC
B.
Factors A and C, and their interaction AC, are likely significant
C. All main effects are likely significant; interactions are not
D. The design is invalid because interactions should not be estimated

Answer: B
Reason: In a normal probability plot of effects, points that fall far from the reference line indicate significant effects; thus A, C, and AC are likely significant contributors to the response and should be retained in the model. The other options either ignore visually significant terms, overgeneralize significance to all main effects, or mischaracterize the design's validity.

---

A Black Belt is asked to study 5 two-level factors with very limited resources and is only interested in screening for potentially important main effects, accepting that interactions may be confounded. Which design choice is most appropriate?

A. 2⁵ full factorial design
B. 2⁵⁻¹ fractional factorial design (Resolution V)
C. Plackett–Burman design with 12 runs
D. 2² factorial for two factors at a time

Answer: C
Reason: A Plackett–Burman design is a high-efficiency screening design for many factors, primarily estimating main effects while intentionally confounding higher-order interactions, and a 12-run design is appropriate for screening 5 two-level factors with limited runs. The other options either require too many runs (full factorial), provide higher resolution than necessary for screening, or do not allow simultaneous screening of all 5 factors (two at a time is inefficient and misses interactions across pairs).

---

A Black Belt fits a DOE regression model and obtains the following ANOVA summary: model p-value < 0.001, R² = 85%, adjusted R² = 83%, but the lack-of-fit test p-value = 0.01. Residual plots show curvature not captured by the model. What is the most appropriate next step?

A. Conclude the model is adequate because the model p-value is significant
B. Ignore the lack-of-fit result because R² is high
C. Add center points and/or quadratic terms to capture curvature
D.
Drop all interaction terms to simplify the model

Answer: C
Reason: A significant lack-of-fit p-value and curvature in the residual plots indicate that the current model form is inadequate; adding center points or second-order (quadratic) terms is the appropriate way to model curvature and reduce lack of fit. The other options incorrectly rely only on global significance or R², or simplify the model in a way that would likely worsen fit rather than address curvature.
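The interaction contrast in the second practice question can be checked numerically; a minimal sketch:

```python
# Cell means from the 2x2 question: (A-,B-), (A+,B-), (A-,B+), (A+,B+)
y_mm, y_pm, y_mp, y_pp = 10, 18, 12, 20

# AB effect = [(A+,B+) + (A-,B-) - (A+,B-) - (A-,B+)] / 2
ab_effect = (y_pp + y_mm - y_pm - y_mp) / 2

# Effect of A at each level of B: identical values mean no interaction.
a_at_b_low = y_pm - y_mm   # 8
a_at_b_high = y_pp - y_mp  # 8

print(ab_effect, a_at_b_low, a_at_b_high)  # 0.0 8 8
```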
