3.3.4 Types of Hypothesis Test

Foundations of Hypothesis Testing

Hypothesis testing is a structured way to use sample data to make decisions about population parameters. The main goal is to decide whether there is enough statistical evidence to support a claim about a mean, proportion, variance, or relationship between variables.

The choice of test type depends on:
- The question being asked (means, proportions, variances, relationships)
- Data type (continuous or discrete)
- Number of groups or samples
- Whether samples are independent or paired
- Assumptions about distribution and variance

All hypothesis tests follow the same general logic:
- Define null and alternative hypotheses
- Choose an appropriate test and significance level
- Calculate a test statistic and p-value
- Compare p-value with significance level to make a decision

Core Decision Dimensions

Type of Data and Parameter of Interest

The first step is to clarify what parameter is being tested.
- Means (μ): for continuous data where averages are relevant.
  - Examples: cycle time, weight, length
- Proportions (p): for discrete data expressed as a fraction or percentage.
  - Examples: defect rate, error rate
- Variances (σ²) and standard deviations (σ): for variability and consistency.
  - Examples: spread of delivery times, spread of dimensions
- Relationships and differences in groups: when comparing multiple groups or predicting one variable from another.
  - Examples: effect of factor levels, correlation, regression

Number and Relationship of Samples

Tests also depend on how many samples are being compared and how they are related.
- One-sample tests: compare a sample to a known or target value.
- Two-sample independent tests: compare two separate groups with no natural pairing.
  - Example: Machine A vs Machine B
- Paired (dependent) tests: compare before/after or matched data from the same or linked units.
  - Example: before vs after improvement on the same process
- Multiple-sample tests: compare three or more groups or factor levels.

Hypotheses: One-Tailed vs Two-Tailed

Null and Alternative Hypotheses

Each test uses:
- Null hypothesis (H₀): states no effect, no difference, or equality.
- Alternative hypothesis (H₁ or Hₐ): states the effect or difference of interest.

Direction of the Test

- Two-tailed tests: check for any difference from a target.
  - H₀: parameter = target
  - H₁: parameter ≠ target
- One-tailed tests: check for a difference in a specific direction.
  - Upper tail (greater than)
    - H₀: parameter ≤ target
    - H₁: parameter > target
  - Lower tail (less than)
    - H₀: parameter ≥ target
    - H₁: parameter < target

The type of tail is determined by the practical question and must be set before analyzing data.

Tests for Means

One-Sample z and t Tests

Use these to compare a sample mean to a known or target mean.
- One-sample z test
  - Parameter: mean (μ)
  - Use when:
    - Population standard deviation is known
    - Data are approximately normal, or sample size is large
  - Typical application: comparing a process mean to a specification target when long-term σ is known
- One-sample t test
  - Parameter: mean (μ)
  - Use when:
    - Population standard deviation is unknown (usual case)
    - Data are approximately normal, or the sample size is at least moderate
  - Typical application: small to moderate samples, no known σ

Two-Sample t Tests (Independent Samples)

Used to compare means of two independent groups.
- Two-sample t test (pooled or unpooled)
  - Parameter: difference in means (μ₁ − μ₂)
  - Use when:
    - Both groups are independent
    - Data in each group are approximately normal
  - Two forms:
    - Equal variances assumed (pooled)
    - Unequal variances (Welch’s t test)

The choice between the equal-variance and unequal-variance versions is often supported by a variance test, but practical robustness also matters: Welch’s t test remains valid when the variances happen to be equal, so it is a common default.

Paired t Test

Used when data are naturally paired.
- Paired t test
  - Parameter: mean of differences (μd)
  - Use when:
    - Measurements are taken on the same units before and after a change
    - Or units are matched in pairs
  - Analysis is performed on the difference between paired observations

This test is often more powerful than independent tests when a strong pairing exists.

Tests for Proportions

One-Sample Proportion Test

Used to compare a sample proportion to a target or known proportion.
- One-sample z test for proportion
  - Parameter: proportion (p)
  - Use when:
    - Sample size is large enough for normal approximation (a common rule of thumb is np ≥ 5 and n(1 − p) ≥ 5)
    - Data are pass/fail, yes/no, defect/ok
  - Typical application: checking if a defect rate exceeds a requirement

Two-Sample Proportion Test

Used to compare two independent proportions.
- Two-sample z test for proportions
  - Parameter: difference in proportions (p₁ − p₂)
  - Use when:
    - Two independent samples of attribute data
    - Sample sizes large enough for normal approximation
  - Typical application: comparing defect rates from two lines or two methods

Tests for Variance and Standard Deviation

One-Sample Chi-Square Test for Variance

Used to test whether a process variance or standard deviation equals a specified value.
- Chi-square test for one variance
  - Parameter: variance (σ²) or standard deviation (σ)
  - Use when:
    - Data are from a normal distribution
  - Applications:
    - Testing if process variation meets a requirement
    - Verifying change in variability after a modification

Two-Sample F Test for Equality of Variances

Used to compare the variances of two independent samples.
- F test for two variances
  - Parameter: ratio of variances (σ₁² / σ₂²)
  - Use when:
    - Data in both groups are approximately normal
  - Common use:
    - Checking equality of variances before choosing a two-sample t test approach

Comparing Multiple Means: ANOVA

One-Way ANOVA

Analysis of variance (ANOVA) compares mean responses across more than two groups based on one factor.
- One-way ANOVA
  - Parameter: set of group means (μ₁, μ₂, …, μk)
  - Hypotheses:
    - H₀: all group means are equal
    - H₁: at least one mean is different
  - Use when:
    - One categorical factor with three or more levels
    - Data in each group approximately normal, with similar variances
  - Typical application: comparing multiple machine settings or suppliers

Two-Way ANOVA and Factorial ANOVA

These extend ANOVA to two or more factors.
- Two-way ANOVA / factorial ANOVA
  - Parameters: main effects and interactions between factors
  - Use when:
    - Studying two or more factors simultaneously
  - Key outputs:
    - Main effect tests for each factor
    - Interaction tests to see if factor effects depend on each other

Hypothesis tests in ANOVA are based on F statistics comparing between-group variation to within-group variation.

Tests for Relationships and Association

Tests for Correlation and Simple Linear Regression

These tests examine the linear relationship between two continuous variables.
- Test for correlation (Pearson’s r)
  - Parameter: correlation coefficient (ρ)
  - Hypotheses:
    - H₀: ρ = 0 (no linear correlation)
    - H₁: ρ ≠ 0, ρ > 0, or ρ < 0
  - Use when:
    - Both variables are continuous and approximately bivariate normal
- t test on regression slope
  - Parameter: slope (β₁)
  - Hypotheses:
    - H₀: β₁ = 0 (no linear relationship)
    - H₁: β₁ ≠ 0, β₁ > 0, or β₁ < 0
  - Use when:
    - Fitting a simple linear regression model

The slope test and the correlation test are closely related: in simple linear regression, the two-sided versions give the same t statistic and p-value.

Multiple Regression and Model Terms

In multiple regression, hypothesis tests are used to evaluate individual terms and overall fit.
- t tests on regression coefficients
  - Parameters: individual coefficients (βi)
  - Hypotheses for each term:
    - H₀: βi = 0 (no effect)
    - H₁: βi ≠ 0 (effect present)
  - Use to decide which predictors are statistically significant
- F test for overall regression
  - Parameter: model as a whole
  - Hypotheses:
    - H₀: all slope coefficients are zero
    - H₁: at least one slope coefficient is nonzero

These tests determine whether the regression model explains a meaningful portion of variation in the response.

Chi-Square Test for Independence

Used to test association between two categorical variables.
- Chi-square test for independence
  - Parameter: association between row and column categories
  - Hypotheses:
    - H₀: variables are independent
    - H₁: variables are associated
  - Use when:
    - Data are counts in a contingency table
  - Typical application: relationship between defect type and shift, or between cause category and line

Nonparametric Alternatives

When assumptions like normality or equal variance are seriously violated, nonparametric tests provide alternatives that rely less on distribution assumptions.
- Mann–Whitney test (Wilcoxon rank-sum)
  - Alternative to: two-sample t test
  - Parameter: difference in central tendency of two independent groups
- Wilcoxon signed-rank test
  - Alternative to: one-sample t test or paired t test
  - Parameter: median difference
- Kruskal–Wallis test
  - Alternative to: one-way ANOVA
  - Parameter: differences in distribution central tendencies among multiple groups
- Mood’s median test or other median-based tests
  - Used when comparing medians across groups under minimal assumptions

Nonparametric tests usually test hypotheses about medians or general shift in distributions rather than means.

Choosing the Correct Test

Stepwise Selection Logic

Selecting the right test requires answering a sequence of focused questions.
- What parameter is of interest?
  - Mean, proportion, variance, relationship, or association
- How many samples or groups are being compared?
  - One, two, or more than two
- Are samples independent or paired?
  - Independent groups vs repeated or matched measures
- What type of data is used?
  - Continuous vs discrete (pass/fail, counts)
- Can parametric assumptions be reasonably satisfied?
  - Approximate normality
  - Similar variances across groups
  - Sufficient sample size
- Is the hypothesis directional or non-directional?
  - One-tailed vs two-tailed

Typical Mapping from Situation to Test

- One mean vs target, σ unknown
  - One-sample t test
- One mean vs target, σ known, large n
  - One-sample z test
- Two independent means
  - Two-sample t test (equal or unequal variances)
- Paired before/after means
  - Paired t test
- One proportion vs target
  - One-sample z test for proportion
- Two independent proportions
  - Two-sample z test for proportions
- More than two means (one factor)
  - One-way ANOVA
- More than two means (multiple factors)
  - Factorial ANOVA
- Testing variance against a standard
  - Chi-square test for one variance
- Comparing two variances
  - F test for two variances
- Linear relationship between two continuous variables
  - Correlation test and slope t test in regression
- Association between two categorical variables
  - Chi-square test for independence
- Violated normality or strong outliers
  - Nonparametric alternatives such as Mann–Whitney, Wilcoxon, Kruskal–Wallis

Common Patterns and Pitfalls

Matching Test to Practical Question

The statistical test must reflect the real decision question.
- Do not use a two-sample test when data are naturally paired
- Do not use a test for means on proportion data
- Do not mix measurement scales (continuous vs categorical) incorrectly

Assumption Awareness

Understanding which assumptions matter for each test is essential.
- Parametric mean tests and ANOVA:
  - Approximate normality
  - Similar variances across groups
- Proportion tests:
  - Sufficient sample sizes for normal approximation
- Variance tests:
  - Stronger normality requirement
- Nonparametric tests:
  - Fewer distribution assumptions but may be less powerful

When assumptions are doubtful, consider:
- Data transformation
- Nonparametric alternatives
- Robust interpretation of results

Summary

Hypothesis tests are specific tools for answering focused questions about population parameters using sample data.

The correct test is determined by:
- The parameter of interest (mean, proportion, variance, relationship, association)
- The number and relationship of samples (one, two, multiple; independent or paired)
- The data type (continuous or discrete)
- The direction of the hypothesis (one-tailed or two-tailed)
- The validity of distribution and variance assumptions

Key families of tests include:
- z and t tests for means
- Proportion tests for rates and percentages
- Chi-square and F tests for variances and associations
- ANOVA for comparing multiple means
- Regression and correlation tests for relationships
- Nonparametric tests when parametric assumptions fail

Mastery of types of hypothesis tests involves correctly mapping real-world questions and data structures to the appropriate statistical test and understanding the core assumptions behind each choice.
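The situation-to-test mapping in this section can be sketched as a small lookup function. This is only an illustrative sketch: the function name `choose_test` and its arguments (`parameter`, `groups`, `paired`, `sigma_known`, `parametric_ok`, `factors`) are hypothetical names chosen for this example, and the function covers only the situations listed in the mapping.

```python
def choose_test(parameter, groups=1, paired=False, sigma_known=False,
                parametric_ok=True, factors=1):
    """Suggest a test from the section's situation-to-test mapping.

    parameter: "mean", "proportion", "variance", "relationship" or "association"
    groups: number of samples or groups being compared
    paired: True for before/after or matched measurements
    sigma_known: True if the population standard deviation is known
    parametric_ok: False if normality is clearly violated or outliers are strong
    factors: number of factors studied (only used for more than two means)
    """
    if parameter == "mean":
        if groups == 1:
            if not parametric_ok:
                return "Wilcoxon signed-rank test"
            return "one-sample z test" if sigma_known else "one-sample t test"
        if groups == 2:
            if paired:
                return "paired t test" if parametric_ok else "Wilcoxon signed-rank test"
            return "two-sample t test" if parametric_ok else "Mann-Whitney test"
        if factors > 1:
            return "factorial ANOVA"
        return "one-way ANOVA" if parametric_ok else "Kruskal-Wallis test"
    if parameter == "proportion" and groups in (1, 2):
        return ("one-sample z test for proportion" if groups == 1
                else "two-sample z test for proportions")
    if parameter == "variance" and groups in (1, 2):
        return ("chi-square test for one variance" if groups == 1
                else "F test for two variances")
    if parameter == "relationship":
        return "correlation test or slope t test in regression"
    if parameter == "association":
        return "chi-square test for independence"
    # Anything else falls outside the mapping and needs a judgment call.
    raise ValueError("situation not covered by this sketch")


print(choose_test("mean", groups=1))               # one mean vs target, sigma unknown
print(choose_test("mean", groups=2, paired=True))  # before/after on the same units
print(choose_test("mean", groups=3, parametric_ok=False))
```

A lookup like this cannot replace checking the assumptions for the data at hand; it only mirrors the order of questions in the stepwise selection logic (parameter, number of groups, pairing, assumptions).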

Practical Case: Types of Hypothesis Test

A medical device plant is facing rising customer complaints about blood glucose test strips failing in the field. Management suspects a recent raw-material change may be causing the issue. The Black Belt leads an investigation that applies several types of hypothesis test to isolate the drivers of the failures.

First, the team compares the average strip sensitivity (a continuous CTQ) before vs. after the material change using a 2-sample t-test. This checks whether the mean sensitivity has shifted since the change.

Next, they verify whether the proportion of failed strips (pass/fail attribute data) has increased after the change using a 2-proportion test. This tests whether the failure rate is statistically higher with the new material.

They then compare strip sensitivity across three production lines using one-way ANOVA to see if any line is producing significantly different performance levels with the same material.

Finally, they test whether the variance of strip sensitivity has increased with the new material using an F-test for variances, since greater variability could trigger more failures at the specification limits.

The 2-sample t-test and 2-proportion test both show significant worsening after the material change; the ANOVA finds no significant difference among lines, suggesting all lines are similarly affected; the F-test confirms increased variability. The team concludes the new material is the root cause and reverts to the previous supplier, after which complaint rates return to baseline.
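To make two of the statistics in this case concrete, the sketch below computes a Welch t statistic (with its Welch-Satterthwaite degrees of freedom) and a variance ratio for an F test. The sensitivity readings are invented purely for illustration and are not data from the case; the function names `welch_t` and `variance_ratio` are likewise hypothetical. Only the test statistics are computed, since the Python standard library has no t or F distribution; a statistics package such as SciPy (for example `scipy.stats.ttest_ind` with `equal_var=False`) would be the usual way to obtain p-values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each term is a sample variance divided by its sample size.
    va = variance(sample_a) / n_a
    vb = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df


def variance_ratio(sample_a, sample_b):
    """Return the F statistic for two variances: var(sample_a) / var(sample_b)."""
    return variance(sample_a) / variance(sample_b)


# Hypothetical sensitivity readings, made up for this illustration only.
before = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0]
after = [9.6, 10.4, 9.1, 10.6, 9.3, 9.8]

t_stat, df = welch_t(before, after)
f_stat = variance_ratio(after, before)  # tests whether variability went up
print(f"Welch t = {t_stat:.2f} on about {df:.1f} degrees of freedom")
print(f"F = {f_stat:.1f}")
```

Putting the "after" variance in the numerator matches the one-tailed question in the case, namely whether variability increased with the new material.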

Practice questions: Types of Hypothesis Test

A Black Belt must compare the average cycle time of a new automated process to the historical mean of 18 minutes, using 30 independent observations from the new process. The population standard deviation is unknown and the team assumes normality. Which is the most appropriate hypothesis test?

A. 1-sample t-test (two-sided)
B. 1-sample Z-test (two-sided)
C. 2-sample t-test (two-sided)
D. Paired t-test (two-sided)

Answer: A

Reason: There is one sample mean compared to a known historical mean and σ is unknown, so a 1-sample t-test is appropriate; the assumed normality supports its use at n = 30.

Other options: B requires known population σ; C is for two independent samples; D is for matched pairs on the same units.

---

A team wants to determine whether defect proportion has decreased after a process change. Before the change, 200 units were inspected with 40 defects. After the change, 250 units were inspected with 30 defects. Which hypothesis test is most appropriate?

A. 1-proportion Z-test
B. 2-proportion Z-test
C. Chi-square goodness-of-fit test
D. 2-sample t-test

Answer: B

Reason: Two independent samples of binomial data (defective / non-defective) are compared to test a change in proportions, so a 2-proportion Z-test is appropriate.

Other options: A is for one proportion only; C compares observed vs expected categorical frequencies, not two proportions; D is for comparing means, not proportions.

---

A Black Belt needs to check whether three different suppliers have the same mean tensile strength for a critical component. Data for tensile strength (continuous, approximately normal) are collected from each supplier, with similar subgroup sizes and no pairing. Which is the most appropriate test?

A. 1-way ANOVA
B. 2-sample t-test (pooled)
C. Kruskal–Wallis test
D. Chi-square test of independence

Answer: A

Reason: One continuous response and one categorical factor with three independent levels requires a 1-way ANOVA to test equality of means.

Other options: B only compares two means; C is nonparametric and usually used if normality is violated; D is for categorical–categorical relationships, not continuous responses.

---

In a gauge R&R study, a Black Belt wants to test whether three appraisers give the same average measurement for a reference part, assuming normality and equal variances. Each appraiser measures the same part multiple times. Which hypothesis test best addresses the equality of appraisers’ means?

A. Paired t-test
B. 1-way ANOVA (fixed factor: appraiser)
C. 2-proportion Z-test
D. Chi-square goodness-of-fit test

Answer: B

Reason: The goal is to compare mean measurements across more than two groups (three appraisers) on a continuous response, so a 1-way ANOVA by appraiser is appropriate.

Other options: A is for two related means; C compares proportions, not means; D is used for categorical counts vs expected frequencies.

---

A Black Belt is investigating whether a new training method reduces average handling time compared with the current method using a crossover design. Each of 20 agents handles calls using Method A (current) and Method B (new), in randomized order, and the difference in time per agent is of interest. Which test is most appropriate?

A. 2-sample t-test (independent samples)
B. Paired t-test (1-sample t on differences)
C. 1-way ANOVA
D. 2-proportion Z-test

Answer: B

Reason: The same agents use both methods, creating matched pairs; the analysis should use a paired t-test (equivalently a 1-sample t-test on the within-agent differences).

Other options: A assumes independent samples; C is generally for ≥3 groups; D is for proportions, not continuous time data.
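As a worked illustration of the 2-proportion Z-test from the second practice question (40 defects in 200 units before, 30 defects in 250 units after), the sketch below computes the pooled z statistic and a one-tailed p-value using only the Python standard library. The function name `two_proportion_z_test` is a hypothetical helper written for this example, and the one-tailed direction (H₁: the "before" proportion is larger) is an assumption that follows from the question asking whether the defect proportion has decreased.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Upper-tail z test of H0: p1 <= p2 against H1: p1 > p2 (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the two samples share one proportion, estimated by pooling.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Counts taken from the practice question: before = sample 1, after = sample 2.
z, p_value = two_proportion_z_test(40, 200, 30, 250)
print(f"z = {z:.2f}, one-tailed p-value = {p_value:.4f}")
```

With these counts the sample proportions are 0.20 and 0.12 and z comes out at about 2.33. If a significance level of 0.05 is assumed (the question does not state one), this exceeds the one-tailed critical value of about 1.645, so the data would support a decrease in the defect proportion.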
