I. Statistical Inference for Categorical Data
A. Single Proportion ($p$)
| Method | Statistic/Interval Formula | Validity/Assumptions | Key Feature |
|---|---|---|---|
| Wald Test (H$_0: p = p_0$) | $Z = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1-\hat{p})/n}}$ | $X \ge 5$ and $(n - X) \ge 5$ | Requires large samples |
| Score Test (H$_0: p = p_0$) | $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$ | $np_0 \ge 5$ and $n(1-p_0) \ge 5$ | Better small sample properties than Wald |
| Wald CI | $\hat{p} \pm Z_{1-\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ | Performs well only when $n\hat{p}$ and $n(1-\hat{p})$ are both large | Symmetric around $\hat{p}$ |
| Wilson (Score-based) CI | $\frac{(2n\hat{p} + z^2) \pm \sqrt{z^4 + 4nz^2\hat{p}(1-\hat{p})}}{2(n + z^2)}$ (where $z = Z_{1-\alpha/2}$) | Provides better coverage than Wald when $n$ is not large or $p$ is near 0 or 1. | Not symmetric around $\hat{p}$. The Wald CI can fall outside $[0, 1]$; the Wilson CI cannot. |
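The two intervals above can be sketched directly from their formulas. A minimal pure-Python version (function names are ours, for illustration); note how the Wald interval dips below 0 for a small count while the Wilson interval stays inside $[0, 1]$:

```python
import math

Z = 1.959963985  # 97.5th percentile of N(0,1), for a 95% CI

def wald_ci(x, n, z=Z):
    """Wald CI: p_hat +/- z * sqrt(p_hat(1-p_hat)/n). Can exceed [0, 1]."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_ci(x, n, z=Z):
    """Wilson (score) CI; guaranteed to stay inside [0, 1]."""
    p = x / n
    center = 2 * n * p + z * z
    half = math.sqrt(z ** 4 + 4 * n * z * z * p * (1 - p))
    denom = 2 * (n + z * z)
    return (center - half) / denom, (center + half) / denom

# x = 1 event in n = 10 trials: Wald lower limit is negative, Wilson's is not.
wl, wu = wald_ci(1, 10)
sl, su = wilson_ci(1, 10)
```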
B. Comparing Two Proportions ($p_1, p_2$)
| Comparison Target | Estimate | Hypothesis Test H$_0: p_1 = p_2$ |
|---|---|---|
| Risk Difference (RD) | $\hat{R}D = \hat{p}_1 - \hat{p}_2$ | Z-test/$\chi^2$ Test: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}} \sim N(0, 1)$ |
| Pooled Proportion | $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ | Equivalence: $\mathbf{Z^2 = \chi^2}$ (same assumptions, identical result). |
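A sketch of the pooled Z-test above (illustrative function name, hypothetical counts). Squaring the result reproduces the chi-square statistic of the corresponding 2x2 table, per the $Z^2 = \chi^2$ equivalence:

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled estimate p_hat
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical data: 30/100 events vs 15/100 events
z = two_prop_z(30, 100, 15, 100)
# z**2 equals the Pearson chi-square statistic for the same 2x2 table
```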

| Confidence Interval Method | Formula/Structure | Conditions |
|---|---|---|
| Simple CI for $\mathbf{p_1 - p_2}$ | $\hat{p}_1 - \hat{p}_2 \pm Z_{1-\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$ | May be unsafe if $<30$ subjects per group or the $\hat{p}_i$ are close to 0 or 1. |
| Newcombe’s CI (Better Coverage) | Uses Wilson CIs ($l_i, u_i$) for $p_1, p_2$. | Preferred for small samples. |
| Newcombe Lower Limit | $\hat{p}_1 - \hat{p}_2 - \sqrt{(\hat{p}_1 - l_1)^2 + (u_2 - \hat{p}_2)^2}$ | |
| Newcombe Upper Limit | $\hat{p}_1 - \hat{p}_2 + \sqrt{(\hat{p}_2 - l_2)^2 + (u_1 - \hat{p}_1)^2}$ | |
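Newcombe's hybrid interval plugs the two Wilson limits $(l_i, u_i)$ into the expressions above. A self-contained sketch (our function names, hypothetical counts):

```python
import math

Z = 1.959963985  # z for a 95% CI

def wilson_ci(x, n, z=Z):
    """Wilson (score) CI for a single proportion."""
    p = x / n
    half = math.sqrt(z ** 4 + 4 * n * z * z * p * (1 - p))
    return ((2 * n * p + z * z - half) / (2 * (n + z * z)),
            (2 * n * p + z * z + half) / (2 * (n + z * z)))

def newcombe_ci(x1, n1, x2, n2):
    """Newcombe CI for p1 - p2, built from the two Wilson CIs."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_ci(x1, n1)
    l2, u2 = wilson_ci(x2, n2)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((p2 - l2) ** 2 + (u1 - p1) ** 2)
    return lower, upper

# Hypothetical data: 8/20 vs 3/20
lo, hi = newcombe_ci(8, 20, 3, 20)
```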
C. Measures of Effect (2x2 Table: $a, b, c, d$)
| Measure | Formula ($\hat{p}_1 = a/n_1, \hat{p}_2 = c/n_2$) | Log Variance Estimate $\hat{V}ar(\log \hat{M})$ |
|---|---|---|
| Risk Difference (RD) | $\hat{p}_1 - \hat{p}_2$ | N/A |
| Relative Risk (RR) | $\hat{R}R = \frac{a/n_1}{c/n_2}$ | $\frac{1}{a} - \frac{1}{n_1} + \frac{1}{c} - \frac{1}{n_2}$ |
| Odds Ratio (OR) | $\hat{O}R = \frac{ad}{bc}$ | $\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}$ |
- Key Insight: If the event is rare, OR $\approx$ RR. OR is the primary measure for case-control (retrospective) studies.
- NNT (Number Needed To Treat): $NNT = 1/RD$ (Requires $RD > 0$).
- CI for RR/OR: Calculate CI for $\log(\text{Measure}) = \log(\hat{M}) \pm Z \sqrt{\hat{V}ar(\log \hat{M})}$, then exponentiate the limits: $(\exp(l), \exp(u))$.
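The log-scale CI recipe above can be sketched as follows (illustrative function names; counts $a, b, c, d$ are hypothetical):

```python
import math

Z = 1.959963985  # z for a 95% CI

def or_with_ci(a, b, c, d):
    """OR = ad/bc; CI computed on the log scale, then exponentiated."""
    or_hat = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(or_hat)
    return or_hat, math.exp(log_or - Z * se_log), math.exp(log_or + Z * se_log)

def rr_with_ci(a, n1, c, n2):
    """RR = (a/n1)/(c/n2); Var(log RR) = 1/a - 1/n1 + 1/c - 1/n2."""
    rr_hat = (a / n1) / (c / n2)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    log_rr = math.log(rr_hat)
    return rr_hat, math.exp(log_rr - Z * se_log), math.exp(log_rr + Z * se_log)

# Hypothetical 2x2 table: a=10, b=90, c=5, d=95 (n1 = n2 = 100)
orv, or_lo, or_hi = or_with_ci(10, 90, 5, 95)
rrv, rr_lo, rr_hi = rr_with_ci(10, 100, 5, 100)
```

With the event rare here (10% vs 5%), the OR (about 2.11) sits close to the RR (2.0), illustrating the rare-event approximation.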
II. Chi-Square Tests
- General Validity Rule: Chi-square tests (Goodness-of-Fit, Independence, Homogeneity) are valid based on the Central Limit Theorem (CLT) and require that all Expected Cell Counts ($\mathbf{E_{ij}}$) be $\ge 5$.
- Fisher’s Exact Test: Used for 2x2 tables if one or more expected cell frequencies is less than 5.
| Test | Null Hypothesis (H$_0$) | $\chi^2$ Statistic | Degrees of Freedom ($df$) |
|---|---|---|---|
| Goodness-of-fit | Specified proportions ($p_1, \dots, p_K$) are true. | $\sum_{k=1}^{K} \frac{(O_k - E_k)^2}{E_k}$ | $K - 1$ |
| Independence/Homogeneity (R x C Table) | Variables are independent / Risks are equal ($p_1=p_2$). | $\sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ | $(r - 1)(c - 1)$ |
| McNemar (Paired) | $p_{\text{before}} = p_{\text{after}}$. (Used for “before-and-after” designs). | $\chi^2_M = \frac{(b - c)^2}{b + c}$ (where $b, c$ are discordant pairs). | 1 |
- Expected Cell Count: $E_{ij} = \frac{n_i m_j}{n}$ (Row total $\times$ Column total / Grand total).
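A pure-Python sketch of the independence test above, using the expected-count formula $E_{ij} = n_i m_j / n$ (the function name and example table are ours):

```python
def chi_square_independence(table):
    """Pearson chi-square for an r x c table; returns (statistic, df).
    Valid when every expected count E_ij >= 5."""
    r, c = len(table), len(table[0])
    row = [sum(t) for t in table]
    col = [sum(table[i][j] for i in range(r)) for j in range(c)]
    n = sum(row)
    stat = 0.0
    for i in range(r):
        for j in range(c):
            e = row[i] * col[j] / n  # E_ij = row total * column total / n
            stat += (table[i][j] - e) ** 2 / e
    return stat, (r - 1) * (c - 1)

# Hypothetical 2x2 table; df = 1, so compare to the 3.841 critical value
stat, df = chi_square_independence([[30, 70], [15, 85]])
```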
Examination Cheatsheet (Back Side)
III. Stratified Analysis & Confounding Control
- Mantel-Haenszel (MH) Methods: Used to combine results across $K$ strata (confounder levels) and control the confounding variable. Tests H$_0$: Adjusted OR = 1.
- MH OR Estimator: $$\hat{OR}_{MH} = \frac{\sum_k a_k d_k / n_k}{\sum_k b_k c_k / n_k}$$
- MH RR Estimator: $$\hat{RR}_{MH} = \frac{\sum_k a_k n_{2k} / n_k}{\sum_k c_k n_{1k} / n_k}$$
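The MH OR estimator is a ratio of stratum-weighted sums, as a short sketch shows (function name and strata counts are hypothetical):

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel adjusted OR across strata of (a, b, c, d) counts."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum total n_k
        num += a * d / n
        den += b * c / n
    return num / den

# Two hypothetical strata (e.g., levels of a confounder)
or_mh = mh_odds_ratio([(10, 20, 5, 40), (8, 12, 6, 30)])
```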
IV. Sample Size Determination
| Scenario | Margin of Error ($E$) / Power ($1-\beta$) | Required Sample Size ($n$) Formula |
|---|---|---|
| One Group Proportion | Estimation with margin $E$ (CI). | $n = p(1-p) \left(\frac{Z_{1-\alpha/2}}{E}\right)^2$. (Use $p=0.5$ for maximum $n$). |
| Two Group Proportions | Comparing $p_1$ vs. $p_2$ (Equal $n_1=n_2=n$). | $n = \left(\frac{Z_{1-\alpha/2} \sqrt{2p(1-p)} + Z_{1-\beta} \sqrt{p_1(1-p_1) + p_2(1-p_2)}}{p_1 - p_2}\right)^2$ where $p = (p_1 + p_2)/2$. |
| Paired Proportions (McNemar) | Comparing proportions based on discordant pairs ($p_b, p_c$). | $n = \left(\frac{Z_{1-\alpha/2} \sqrt{p_c + p_b} + Z_{1-\beta} \sqrt{p_c + p_b - (p_c - p_b)^2}}{p_c - p_b}\right)^2$ |
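The first two rows of the table translate into short formulas; a sketch assuming $\alpha = 0.05$ (two-sided) and power $0.80$, with results rounded up to whole subjects (function names are ours):

```python
import math

Z_ALPHA = 1.959963985  # two-sided alpha = 0.05
Z_BETA = 0.841621234   # power 1 - beta = 0.80

def n_one_proportion(E, p=0.5, z=Z_ALPHA):
    """n to estimate one proportion within margin E; p = 0.5 maximizes n."""
    return math.ceil(p * (1 - p) * (z / E) ** 2)

def n_two_proportions(p1, p2, za=Z_ALPHA, zb=Z_BETA):
    """Per-group n to detect p1 vs p2 at the stated alpha and power."""
    pbar = (p1 + p2) / 2
    num = (za * math.sqrt(2 * pbar * (1 - pbar))
           + zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil((num / (p1 - p2)) ** 2)

n1 = n_one_proportion(0.05)        # worst-case p = 0.5, margin 0.05
n2 = n_two_proportions(0.4, 0.2)   # hypothetical p1 = 0.4 vs p2 = 0.2
```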
V. Analysis of Variance (ANOVA)
A. One-Way ANOVA
- Purpose: Test if $k > 2$ population means are equal ($H_0: \mu_1 = \cdots = \mu_k$).
- Key Conditions: $k$ independent populations, random samples, and Equal Population Variances ($\sigma^2$).
- Variance Decomposition: Total Sum of Squares = Within (Error) + Between (Treatment). $$SST = SS_W + SS_B$$
- Calculations & F-Test:

| Source | Sum of Squares ($SS$) | $df$ | Mean Square ($s^2$) | F-Ratio |
|---|---|---|---|---|
| Between | $SS_B = \sum_{j} n_j(\bar{y}_j - \bar{y})^2$ | $k - 1$ | $s^2_b = SS_B / (k-1)$ | $F = s^2_b / s^2_w$ |
| Within (Error) | $SS_W = \sum_{j} \sum_{i} (y_{ij} - \bar{y}_j)^2$ | $n - k$ | $s^2_w = SS_W / (n-k)$ | $\sim F_{k-1, n-k}$ |
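The decomposition above fits in a few lines of pure Python (illustrative function name and toy data):

```python
def one_way_anova(groups):
    """One-way ANOVA from the SS_B / SS_W decomposition.
    Returns (F, df1, df2) with df1 = k - 1 and df2 = n - k."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ss_b = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_w = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return (ss_b / (k - 1)) / (ss_w / (n - k)), k - 1, n - k

# Toy data: three groups of three observations
F, df1, df2 = one_way_anova([[3, 4, 5], [6, 7, 8], [9, 10, 11]])
```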
B. Repeated Measures ANOVA
- Design: One sample of $n$ subjects, with $k$ repeated measurements per subject.
- Advantage: Increased power by removing random variations between subjects. Accounts for dependency among measurements.
- Hypotheses: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ (Treatment means are equal).
- Error Degrees of Freedom: $df_{\text{Error}} = (n-1)(k-1)$.
- F-Ratio Distribution: $F = s^2_b / s^2_w \sim F_{df_1=k-1, df_2=(n-1)(k-1)}$.
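The repeated-measures decomposition removes the subject sum of squares from the error term, which is where the extra power comes from. A sketch (our function name; `data[i][j]` is subject $i$'s measurement under treatment $j$, toy values):

```python
def repeated_measures_f(data):
    """Repeated-measures ANOVA F-ratio.
    Subtracts subject-to-subject variation from the error SS."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    ss_treat = n * sum((sum(data[i][j] for i in range(n)) / n - grand) ** 2
                       for j in range(k))
    ss_subj = k * sum((sum(row) / k - grand) ** 2 for row in data)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_treat - ss_subj  # what remains after both effects
    df1, df2 = k - 1, (n - 1) * (k - 1)
    return (ss_treat / df1) / (ss_error / df2), df1, df2

# Toy data: 3 subjects, 2 repeated measurements each
F, df1, df2 = repeated_measures_f([[1, 2], [2, 4], [3, 3]])
```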
VI. Multiple Comparisons (Controlling Familywise Error Rate, FWE)
- FWE Definition: The probability of making at least one Type I error when performing $n$ comparisons, each at level $\alpha$. For independent tests, FWE $= 1 - (1 - \alpha)^n$.
- Complex Contrast: $C = \sum_{j=1}^k c_j \mu_j$ where $\sum_{j=1}^k c_j = 0$.
| Procedure | Primary Use Case | Conservatism (Power) | Individual Test Level ($\alpha^*$) |
|---|---|---|---|
| Bonferroni Correction | Any endpoint; not limited to pairwise. | Conservative. | $\alpha^* = \alpha / \binom{k}{2} = \frac{2\alpha}{k(k-1)}$ (for pairwise tests). |
| Tukey’s HSD | Pairwise comparisons only (Gaussian outcomes). | Higher power than Scheffé. | Compares mean difference to critical value HSD. |
| Scheffé’s Procedure | Complex contrasts (Gaussian outcomes). | Most conservative. | Requires a modified F-test for contrasts. |
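The FWE inflation and the Bonferroni pairwise correction are one-liners; a sketch (function names are ours) shows that 10 tests at $\alpha = 0.05$ already push the familywise error to roughly 0.40:

```python
from math import comb

def fwe(alpha, n_tests):
    """P(at least one Type I error) across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_pairwise_alpha(alpha, k):
    """Per-test level for all k-choose-2 pairwise comparisons of k groups."""
    return alpha / comb(k, 2)

inflated = fwe(0.05, 10)                    # roughly 0.40, not 0.05
per_test = bonferroni_pairwise_alpha(0.05, 4)  # 0.05 / 6 for k = 4 groups
```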