3.5.7 One and Two Sample Proportion

Foundations of Proportion Analysis

What is a Proportion?

A proportion is a special kind of average used when data are in categories such as pass/fail, yes/no, or defect/acceptable.

- Definition: A proportion is the fraction of observations in a category of interest.
- Notation:
  - ( p ) = true population proportion (unknown)
  - ( \hat{p} ) = sample proportion (estimate of ( p ))
- Formula for sample proportion:
  [ \hat{p} = \frac{x}{n} ]
  where
  - ( x ) = number of “successes” (events of interest)
  - ( n ) = total sample size

Examples:
- Defect rate in a process: “defective” vs “non-defective”
- Response to a survey: “satisfied” vs “not satisfied”
- Compliance: “meets spec” vs “does not meet spec”

When to Use Proportion Methods

Use proportion-based methods instead of mean-based methods when:
- The outcome is binary (two categories).
- The metric of interest is a percentage, rate, or fraction.
- You count how many items fall into a category, rather than measuring a continuous quantity.

Examples of appropriate questions:
- Has the defect rate improved after a process change?
- Is the current nonconformance proportion higher than the specified target?
- Is the defect rate different between two machines, shifts, or suppliers?

Data Structure and Assumptions

Bernoulli and Binomial Context

Binary data can be modeled using:
- Bernoulli trial: a single trial with two possible outcomes (success/failure).
- Binomial distribution: number of successes ( X ) in ( n ) independent Bernoulli trials with success probability ( p ).

Key properties:
- ( X \sim \text{Binomial}(n, p) )
- Sample proportion ( \hat{p} = X/n ).

Core Assumptions

For standard one- and two-sample proportion methods:
- Independence:
  - Each observation is independent of others.
  - The outcome of one item does not affect another.
- Binomial structure:
  - Fixed number of trials ( n ).
  - Each trial has the same probability ( p ) of success.
  - Only two outcomes per trial.
- Sample size condition for normal approximation:
  - For one-sample: ( n\hat{p} \ge 5 ) and ( n(1 - \hat{p}) \ge 5 ) (some sources use 10 instead of 5; be conservative in borderline cases).
  - For two-sample: ( n_1\hat{p}_1,\ n_1(1-\hat{p}_1),\ n_2\hat{p}_2,\ n_2(1-\hat{p}_2) ) all sufficiently large (apply the same threshold to each of the four counts).

When these conditions hold, the sampling distribution of ( \hat{p} ) (or ( \hat{p}_1 - \hat{p}_2 )) can be approximated by a normal distribution, which enables z-tests and confidence intervals.

One-Sample Proportion: Concepts and Testing

One-Sample Proportion Scenario

A one-sample proportion situation compares a single observed proportion to a reference value (target or specification).

Typical questions:
- Is the nonconformance rate greater than the customer requirement?
- Has the defect rate fallen below a specified goal?
- Is the proportion of late deliveries different from a historical rate?

Components:
- Observed sample proportion ( \hat{p} = x/n )
- Hypothesized population proportion ( p_0 ) (target or baseline)

Hypotheses for One-Sample Proportion

Formulate hypotheses using the population proportion ( p ):
- Two-sided test (difference in either direction):
  - ( H_0: p = p_0 )
  - ( H_a: p \ne p_0 )
- One-sided test (increase of concern):
  - ( H_0: p \le p_0 )
  - ( H_a: p > p_0 )
- One-sided test (decrease of concern):
  - ( H_0: p \ge p_0 )
  - ( H_a: p < p_0 )

Choose:
- Two-sided when any difference matters.
- One-sided when only higher or only lower values are practically important.

Test Statistic for One-Sample Proportion

Assuming the normal approximation is valid:
- Standard error under ( H_0 ):
  [ SE_0 = \sqrt{\frac{p_0(1 - p_0)}{n}} ]
- z-test statistic:
  [ z = \frac{\hat{p} - p_0}{SE_0} ]

Interpretation:
- Large positive ( z ): sample proportion much higher than ( p_0 ).
- Large negative ( z ): sample proportion much lower than ( p_0 ).
- Near-zero ( z ): sample proportion close to ( p_0 ).
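A minimal sketch of the one-sample z-test defined above, using only the Python standard library. The function name `one_sample_prop_ztest`, the `alternative` labels, and the example counts (30 defectives in 250 units against a target of 0.08) are illustrative assumptions, not part of the source material.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_prop_ztest(x, n, p0, alternative="two-sided"):
    """Normal-approximation z-test of a single proportion against p0.

    x: number of successes, n: sample size, p0: hypothesized proportion.
    Returns (p_hat, z, p_value).
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)      # standard error under H0
    z = (p_hat - p0) / se0
    cdf = NormalDist().cdf(z)          # standard normal CDF at z
    if alternative == "greater":       # Ha: p > p0
        p_value = 1 - cdf
    elif alternative == "less":        # Ha: p < p0
        p_value = cdf
    elif alternative == "two-sided":   # Ha: p != p0
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return p_hat, z, p_value


# Hypothetical data: 30 defectives in 250 units, target defect rate 8%.
p_hat, z, p_value = one_sample_prop_ztest(30, 250, 0.08, alternative="greater")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

With these hypothetical counts the expected successes and failures under the target (20 and 230) are well above 5, so the normal approximation is reasonable here; for small counts an exact binomial test would be the safer choice.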
The p-value is calculated from the standard normal distribution using the chosen alternative hypothesis.

One-Sample Proportion Confidence Interval

To estimate the true proportion ( p ), construct a confidence interval (CI):
- Standard error using sample estimate:
  [ SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} ]
- Confidence interval:
  [ \hat{p} \pm z_{\alpha/2} \times SE ]
  where:
  - ( z_{\alpha/2} ) is the critical value for the desired confidence level (for example, 1.96 for 95%).
  - The interval provides a plausible range for the true proportion.

Interpretation:
- If the hypothesized proportion ( p_0 ) lies outside the CI, it is inconsistent with the data at that confidence level.

Sample Size for One-Sample Proportion Tests

To plan a study, determine the sample size needed to detect a meaningful difference with acceptable risk.

Key ingredients:
- Baseline or target proportion ( p_0 ).
- Minimum detectable difference ( \Delta = |p - p_0| ) considered practically important.
- Significance level ( \alpha ) (probability of Type I error).
- Power ( 1 - \beta ) (probability of detecting the difference, where ( \beta ) is the probability of a Type II error).

Approximate formulas use the normal approximation and involve both:
- ( z_\alpha ): critical value for the significance level (( z_{\alpha/2} ) for a two-sided test).
- ( z_\beta ): critical value for power.

Conceptual relationships:
- Larger ( \Delta ) → smaller sample size.
- Lower ( \alpha ) or higher power → larger sample size.
- Proportions closer to 0.5 require larger samples (maximum variability).

Statistical software is typically used to compute the final sample size.

Two-Sample Proportion: Concepts and Testing

Two-Sample Proportion Scenario

Two-sample proportion analysis compares proportions from two independent groups.

Examples:
- Defect rate on Line A vs Line B.
- Error rate before vs after a process change (if treated as independent samples).
- Proportion of on-time deliveries from Supplier 1 vs Supplier 2.
Components:
- Group 1: ( x_1 ) successes in ( n_1 ) trials → ( \hat{p}_1 = x_1/n_1 ).
- Group 2: ( x_2 ) successes in ( n_2 ) trials → ( \hat{p}_2 = x_2/n_2 ).

Hypotheses for Two-Sample Proportion

Use the population proportions ( p_1 ) and ( p_2 ):
- Two-sided test:
  - ( H_0: p_1 = p_2 )
  - ( H_a: p_1 \ne p_2 )
- One-sided test (greater):
  - ( H_0: p_1 \le p_2 )
  - ( H_a: p_1 > p_2 )
- One-sided test (less):
  - ( H_0: p_1 \ge p_2 )
  - ( H_a: p_1 < p_2 )

The choice depends on the practical question:
- Are the rates different at all?
- Is one rate specifically higher or lower than the other?

Test Statistic for Two-Sample Proportion

When testing equality, a pooled estimate of the common proportion under ( H_0 ) is used:
- Pooled proportion:
  [ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} ]
- Standard error under ( H_0 ):
  [ SE_0 = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} ]
- z-test statistic:
  [ z = \frac{\hat{p}_1 - \hat{p}_2}{SE_0} ]

Interpretation:
- Large positive ( z ): sample suggests ( p_1 > p_2 ).
- Large negative ( z ): sample suggests ( p_1 < p_2 ).
- Near-zero ( z ): sample proportions are similar.

The p-value is obtained from the standard normal distribution based on the alternative hypothesis.

Two-Sample Proportion Confidence Interval

To estimate the difference ( p_1 - p_2 ), use an unpooled standard error:
- Unpooled standard error:
  [ SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} ]
- Confidence interval for difference:
  [ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \times SE ]

Interpretation:
- If the CI for ( p_1 - p_2 ) includes 0, there is no statistically significant difference at that confidence level.
- The sign of the CI (all positive or all negative) shows which group has the higher proportion.
- The width of the CI reflects precision (narrower = more precise).

Note the distinction:
- Hypothesis test: uses pooled standard error under null.
- Confidence interval: uses unpooled standard error based on separate estimates.
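The pooled-versus-unpooled distinction can be made concrete with a short sketch. The function names `two_sample_prop_ztest` and `diff_prop_ci` and the counts (40 of 500 vs 25 of 500) are hypothetical, chosen only to illustrate the formulas in this section.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_prop_ztest(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # common proportion under H0
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def diff_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical data: 40 defectives in 500 units (group 1), 25 in 500 (group 2).
z, p_value = two_sample_prop_ztest(40, 500, 25, 500)
ci_low, ci_high = diff_prop_ci(40, 500, 25, 500)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci_low:.4f}, {ci_high:.4f})")
```

For these illustrative counts the interval straddles 0 and the two-sided p-value sits just above 0.05, so the test and the interval tell the same story. Because the two standard errors differ slightly, borderline cases can occasionally disagree.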
Sample Size for Two-Sample Proportion Tests

Planning a two-sample comparison requires:
- Baseline proportion ( p_1 ) (for example, current rate).
- Expected or targeted new proportion ( p_2 ).
- Difference of interest ( \Delta = |p_1 - p_2| ).
- Significance level ( \alpha ).
- Desired power ( 1 - \beta ).
- Allocation of sample sizes (often ( n_1 = n_2 ) for simplicity).

Conceptual effects:
- Smaller target difference → larger sample sizes.
- More extreme proportions (near 0 or 1) usually allow smaller samples than near 0.5 for the same difference and error levels.
- Unequal sample sizes can be used, but balanced designs are typically more efficient for a given total sample size.

Software uses normal approximations to solve for required ( n_1 ) and ( n_2 ).

Interpreting and Applying Proportion Results

Statistical Versus Practical Significance

It is important to distinguish:
- Statistical significance:
  - Based on p-value and confidence interval.
  - Addresses whether the observed difference is unlikely under the null hypothesis.
- Practical (or business) significance:
  - Based on the size of the difference in proportion.
  - Addresses whether the difference matters in real-world terms.

Examples:
- A tiny reduction in defect rate (for example, 0.2%) might be statistically significant with a very large sample but might not justify a major process change.
- A moderate change (for example, from 4% to 2% defects) may be both statistically and practically meaningful.

Error Types and Risks

Key error concepts in proportion testing:
- Type I error (α):
  - Concluding there is a difference when none exists.
  - Controlled by choosing the significance level (for example, 0.05).
- Type II error (β):
  - Failing to detect a true difference of interest.
  - Controlled by sample size and chosen power.

In planning and interpreting tests:
- Lower ( \alpha ) protects against false alarms.
- Higher power protects against missing important changes.
- There is a trade-off: reducing both errors simultaneously typically requires larger samples.

Checking Assumptions and Common Pitfalls

Practical checks before relying on proportion results:
- Independence check:
  - Avoid using multiple observations from the same item as separate trials.
  - Be cautious with data from time sequences where dependence may exist (for example, time-series clustering).
- Sample size condition:
  - Verify that expected numbers of successes and failures in each group are large enough.
  - If any expected count is very small, consider exact tests (such as exact binomial or Fisher’s exact) rather than normal approximations.
- Misclassification risk:
  - Ensure that the definition of “success” is consistent and that classification errors are minimized.
  - Mislabeling can bias estimated proportions and distort conclusions.
- Multiple testing:
  - When testing many proportions (for example, many segments or suppliers), the chance of at least one false positive increases.
  - Recognize that without adjustment, some significant results may arise by chance alone.

Summary

One- and two-sample proportion methods are used when outcomes are binary and the key metric is a proportion or percentage. For one-sample problems, the focus is on comparing a single observed proportion ( \hat{p} ) with a reference value ( p_0 ), using a z-test and confidence interval based on the binomial distribution and its normal approximation. For two-sample problems, the goal is to compare proportions from two independent groups, using pooled standard error for hypothesis testing and unpooled standard error for confidence intervals on ( p_1 - p_2 ).

Sound application requires understanding and checking assumptions (independence, binomial structure, adequate sample size for the normal approximation), setting clear hypotheses, interpreting p-values and confidence intervals, and distinguishing statistical from practical significance.
Thoughtful sample size planning ensures that tests have adequate power to detect meaningful differences in proportions, while controlling error risks.
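As a supplement to the sample-size discussion, the sketch below uses one common textbook form of the normal-approximation formulas. The text does not specify which formula a given software package applies, so treat these as an assumption: packages may add continuity corrections or use other variants and return somewhat different numbers. The function names and default settings are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def _z(prob):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(prob)


def n_one_sample(p0, p1, alpha=0.05, power=0.80, two_sided=False):
    """Approximate n to detect a shift from p0 to p1 in a one-sample z-test."""
    z_a = _z(1 - alpha / 2) if two_sided else _z(1 - alpha)
    z_b = _z(power)
    numerator = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


def n_per_group_two_sample(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n per group (equal allocation) to detect p1 vs p2."""
    z_a = _z(1 - alpha / 2) if two_sided else _z(1 - alpha)
    z_b = _z(power)
    p_bar = (p1 + p2) / 2                      # average proportion under H0
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((numerator / (p1 - p2)) ** 2)


# The 4% -> 2% defect example from the text, two-sided alpha = 0.05, power 0.80.
n_two = n_per_group_two_sample(0.04, 0.02)
# Hypothetical one-sample plan: detect a drop from 8% to 5%, one-sided test.
n_one = n_one_sample(0.08, 0.05)
print("per-group n (4% vs 2%):", n_two)
print("one-sample n (8% vs 5%):", n_one)
# Same 2-point difference near 0.5 needs far more data than near 0.
print("per-group n (50% vs 48%):", n_per_group_two_sample(0.50, 0.48))
```

The last line illustrates the conceptual point made earlier: for the same absolute difference, proportions near 0.5 demand a much larger sample than proportions near 0 or 1.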

Practical Case: One and Two Sample Proportion

A call center wants to reduce defect calls, defined as calls where the customer has to call back within 24 hours for the same issue.

Context

The center recently trained agents on a new troubleshooting script. Management needs to know:
1. Is the current defect rate above the internal target of 8%? (one-sample proportion)
2. Did the new script reduce defect calls compared to the old script? (two-sample proportion)

Problem

From one month of data after training:
- They collect a random sample of calls from all agents using the new script.
- They also have a comparable sample from the month before training (old script).

They want data-driven evidence, not opinions, to decide whether to keep, adjust, or drop the script.

How One Sample Proportion Was Applied

They test how the current defect proportion with the new script compares with the 8% target.
- Define: p = proportion of calls that become defect calls using the new script.
- Use the post-training sample only.
- Conduct a one-sample proportion test comparing p to 0.08.
- The test shows p is significantly lower than 8%.

Result: Management confirms the new script meets the internal quality target.

How Two Sample Proportion Was Applied

They then compare performance before vs. after the new script:
- Group 1: Pre-training calls (old script), with proportion p₁ of defect calls.
- Group 2: Post-training calls (new script), with proportion p₂ of defect calls.
- Conduct a two-sample proportion test of p₁ vs. p₂.

The test shows p₂ is statistically lower than p₁.

Result: Management concludes the reduction in defect calls is unlikely to be due to chance alone and standardizes the new script across all teams.
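The case does not report any counts, so the sketch below uses invented numbers (95 defect calls in 1,000 with the old script, 52 in 1,000 with the new one) purely to show how the two analyses might be run. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical counts, invented for illustration only (the case gives no figures).
old_defects, old_calls = 95, 1000     # month before training (old script)
new_defects, new_calls = 52, 1000     # month after training (new script)
target = 0.08                         # internal target from the case

p_old = old_defects / old_calls
p_new = new_defects / new_calls

# Question 1: one-sample test of H0: p >= 0.08 vs Ha: p < 0.08 (new script only).
se0 = sqrt(target * (1 - target) / new_calls)
z_one = (p_new - target) / se0
p_value_one = STD_NORMAL.cdf(z_one)   # lower-tail p-value

# 95% confidence interval for the new-script defect proportion.
z_crit = STD_NORMAL.inv_cdf(0.975)
se_new = sqrt(p_new * (1 - p_new) / new_calls)
ci_new = (p_new - z_crit * se_new, p_new + z_crit * se_new)

# Question 2: two-sample test of H0: p_old <= p_new vs Ha: p_old > p_new.
pooled = (old_defects + new_defects) / (old_calls + new_calls)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / old_calls + 1 / new_calls))
z_two = (p_old - p_new) / se_pooled
p_value_two = 1 - STD_NORMAL.cdf(z_two)   # upper-tail p-value

print(f"one-sample: z = {z_one:.2f}, p-value = {p_value_one:.5f}")
print(f"95% CI for new-script rate: ({ci_new[0]:.4f}, {ci_new[1]:.4f})")
print(f"two-sample: z = {z_two:.2f}, p-value = {p_value_two:.5f}")
```

With these made-up counts both tests point the same way as the case narrative, and the whole confidence interval for the new-script rate lies below the 8% target. Real data could of course lead to a different conclusion.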

Practice questions: One and Two Sample Proportion

A call center wants to test whether the proportion of calls answered within 30 seconds is greater than 80%. A random sample of 200 calls shows 170 were answered within 30 seconds. At α = 0.05, which is the most appropriate hypothesis test?

A. One-sample z-test for mean
B. One-sample z-test for proportion
C. Two-sample z-test for proportion
D. Chi-square goodness-of-fit test

Answer: B

Reason: The parameter of interest is a single population proportion (calls answered within 30 seconds) being compared to a known target (0.80). With a sufficiently large sample, the correct test is a one-sample z-test for proportion. Other options test means, compare two proportions, or test distributions, which do not match this single-proportion scenario.

---

A Black Belt needs to compare the defect rate between two production lines (Line 1 and Line 2). From 400 units on each line, Line 1 has 28 defective units and Line 2 has 16 defective units. Which test statistic formula is appropriate for testing equality of proportions at α = 0.05?

A. ( z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} ), where ( \hat{p} ) is the pooled proportion
B. ( t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} ), where ( s_p^2 ) is the pooled variance
C. ( z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} ), where ( \sigma ) is known
D. ( \chi^2 = \sum\dfrac{(O-E)^2}{E} ) without pooling proportions

Answer: A

Reason: Two independent proportions are compared (defect rates on two lines). The correct test statistic is the two-sample z-test for proportions using the pooled estimate of the overall proportion. Other options are for means or a general chi-square without using the z-form appropriate for two-sample proportion comparison.

---

A service process has a historical complaint rate of 5%. After a process change, 300 transactions are sampled and 9 complaints are observed.
Using a one-sample proportion test at α = 0.05, what is the correct null and alternative hypothesis formulation?

A. H0: p = 0.05; H1: p ≠ 0.05
B. H0: p ≥ 0.05; H1: p < 0.05
C. H0: p ≤ 0.05; H1: p > 0.05
D. H0: p = 0.05; H1: p < 0.05

Answer: B

Reason: The project goal after improvement is to reduce the complaint rate below the historical 5%, so the alternative should be p < 0.05. For a directional (one-sided) test, H0 must contain the complement (p ≥ 0.05). Other options either describe a two-sided test, test for an increase, or incorrectly define the null for a one-sided decrease test.

---

A Black Belt compares the proportion of warranty claims between customers using Version A vs. Version B of a product. She runs a two-sample proportion z-test and evaluates the result at three significance levels: α = 0.10, 0.05, and 0.01. The p-value is 0.032. Which conclusion is most appropriate?

A. Fail to reject H0 at all three α levels
B. Reject H0 at α = 0.10 and 0.05, but fail to reject at α = 0.01
C. Reject H0 at all three α levels
D. Reject H0 only at α = 0.01

Answer: B

Reason: A p-value of 0.032 is less than 0.10 and 0.05 but greater than 0.01, so H0 is rejected at α = 0.10 and 0.05 and not rejected at α = 0.01, indicating evidence of different proportions at the less stringent significance levels. Other options misinterpret the p-value relative to the stated α levels.

---

A Black Belt constructs a 95% confidence interval for the difference in proportions of on-time delivery between two suppliers: ( \hat{p}_1 - \hat{p}_2 = 0.04 ) with 95% CI (−0.01, 0.09). How should this be interpreted in terms of a two-sided hypothesis test of equal proportions at α = 0.05?

A. Reject H0; Supplier 1 is significantly better than Supplier 2
B. Reject H0; the suppliers have significantly different on-time performance
C. Fail to reject H0; no statistically significant difference detected
D. Reject H0; Supplier 2 is significantly better than Supplier 1

Answer: C

Reason: The 95% confidence interval for the difference includes 0, so at α = 0.05 in a two-sided test there is insufficient evidence to conclude that the suppliers’ proportions differ. The point estimate suggests a small advantage for Supplier 1, but it is not statistically significant. Other options claim a significant difference or an advantage for one supplier when the interval including 0 indicates that such claims are not supported at the 5% level.
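The first three practice questions only ask which test or hypotheses to use. As a supplement, the sketch below carries the arithmetic through for the counts they state (170 of 200 against 0.80; 28 of 400 vs 16 of 400; 9 of 300 against 0.05). The helper names are illustrative, and the two-sided alternative for the production-line comparison is an assumption based on the phrase "testing equality".

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal CDF


def one_sample_z(x, n, p0):
    """z statistic for one proportion, standard error computed under H0."""
    return (x / n - p0) / sqrt(p0 * (1 - p0) / n)


def two_sample_z(x1, n1, x2, n2):
    """z statistic for two proportions using the pooled standard error."""
    pooled = (x1 + x2) / (n1 + n2)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se0


# Calls answered within 30 seconds: Ha: p > 0.80 (upper tail).
z_calls = one_sample_z(170, 200, 0.80)
p_calls = 1 - PHI(z_calls)

# Line 1 vs Line 2 defect rates: Ha: p1 != p2 (two-sided, assumed).
z_lines = two_sample_z(28, 400, 16, 400)
p_lines = 2 * (1 - PHI(abs(z_lines)))

# Complaint rate after the change: Ha: p < 0.05 (lower tail).
z_complaints = one_sample_z(9, 300, 0.05)
p_complaints = PHI(z_complaints)

for label, z, p in [("calls", z_calls, p_calls),
                    ("lines", z_lines, p_lines),
                    ("complaints", z_complaints, p_complaints)]:
    verdict = "reject H0" if p < 0.05 else "fail to reject H0"
    print(f"{label}: z = {z:.2f}, p-value = {p:.4f} -> {verdict} at alpha = 0.05")
```

The z statistics come out near 1.77, 1.86, and -1.59 respectively, so only the call-answering test clears the 0.05 threshold. The other two show that an apparent improvement in the sample can still fall short of statistical significance.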
