Week 3 (2026.03.17) - Probability Theory and Statistics
Shared on March 18, 2026
CSE30301: Basic Math for AI – Quiz‑Ready Summary
1. Recap: Probability & Statistics
- Sample Mean
- ( \bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i )
- Unbiased: (E[\bar{X}]=\mu)
- Consistent: as (n\to\infty), (\bar{X}\to\mu) in probability (Law of Large Numbers)
- Sample Variance
- (S^2=\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2)
- Unbiased: (E[S^2]=\sigma^2)
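As a quick check of the two formulas above, here is a minimal sketch that computes (\bar{X}) and (S^2) by hand and compares them with Python's standard library. The sample values are made up for illustration.

```python
import statistics

# Hypothetical sample; the values are made up for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(sample)
x_bar = sum(sample) / n                               # sample mean
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance, n - 1 divisor

# statistics.variance uses the same n - 1 divisor (statistics.pvariance divides by n).
assert abs(x_bar - statistics.mean(sample)) < 1e-12
assert abs(s2 - statistics.variance(sample)) < 1e-12

print(x_bar, s2)
```

Dividing by (n-1) rather than (n) is what makes (S^2) unbiased; dividing by (n) would underestimate (\sigma^2) on average.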
2. Central Limit Theorem (CLT)
- For i.i.d. (X_i) with (E[X_i]=\mu) and (\mathrm{Var}[X_i]=\sigma^2<\infty):
[ \sqrt{n}\left(\bar{X}-\mu\right)\xrightarrow{d}N(0,\sigma^2) ]
- Holds regardless of the shape of the original distribution, as long as the variance is finite (illustrated with Bernoulli, Uniform, and other distributions).
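The CLT statement can be checked empirically. The sketch below is a small simulation, not the lecture's own demo: it draws many samples from a Bernoulli(0.3) and a Uniform(0, 1) distribution and looks at the empirical mean and variance of (\sqrt{n}(\bar{X}-\mu)). The function name `standardized_means`, the seed, and the sample sizes are arbitrary choices made for this illustration.

```python
import random
import statistics

def standardized_means(draw, mu, n, trials, rng):
    """Return `trials` realizations of sqrt(n) * (sample mean - mu)."""
    out = []
    for _ in range(trials):
        x_bar = sum(draw(rng) for _ in range(n)) / n
        out.append(n ** 0.5 * (x_bar - mu))
    return out

rng = random.Random(0)  # fixed seed so the run is repeatable

# Bernoulli(0.3): mu = 0.3, sigma^2 = 0.3 * 0.7 = 0.21
bern = standardized_means(lambda r: 1.0 if r.random() < 0.3 else 0.0,
                          0.3, 200, 2000, rng)
# Uniform(0, 1): mu = 0.5, sigma^2 = 1/12 (about 0.0833)
unif = standardized_means(lambda r: r.random(), 0.5, 200, 2000, rng)

# Both empirical means should be near 0 and the variances near sigma^2.
print(statistics.mean(bern), statistics.variance(bern))
print(statistics.mean(unif), statistics.variance(unif))
```

A histogram of `bern` or `unif` should look approximately bell-shaped even though neither source distribution is normal.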
3. Algorithm Comparison (Illustrative Example)
- Two algorithms A and B produce random variables (X) and (Y).
- Goal: determine if (E[X] < E[Y]) (or vice‑versa).
- Approach:
- Sampling → draw (X_1,\dots,X_n) and (Y_1,\dots,Y_n).
- Compute sample means (\bar{X}) and (\bar{Y}).
- Use statistical inference (hypothesis test or confidence interval).
4. Statistical Testing
4.1 Hypotheses
- Null (H_0): No difference (e.g., (E[X]=E[Y])).
- Alternative (H_1): Difference exists (e.g., (E[X]<E[Y])).
4.2 Test Statistic (Z‑test)
- Assume normal population (N(\mu,\sigma^2)) and known (\sigma).
- (Z=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}})
- Under (H_0:\mu=\mu_0): (Z\sim N(0,1)).
4.3 p‑Value
- Two‑sided: (p=2[1-\Phi(|Z|)]).
- One‑sided (e.g., (H_0:\mu\le\mu_0) vs (H_1:\mu>\mu_0)): (p=1-\Phi(Z)).
- (\Phi) = standard normal CDF.
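The Z statistic and the p‑value formulas above translate directly into code. This is a minimal sketch using `statistics.NormalDist` for (\Phi). The function names, the `alternative` argument, and the example numbers are assumptions made for illustration, and the "less" case is the mirror image of the one‑sided formula above rather than something stated in the notes.

```python
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """Z = (x_bar - mu0) / (sigma / sqrt(n)), with sigma assumed known."""
    return (x_bar - mu0) / (sigma / n ** 0.5)

def p_value(z, alternative="two-sided"):
    """p-value of a Z-test for the given alternative hypothesis."""
    phi = NormalDist().cdf  # standard normal CDF
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z)))
    if alternative == "greater":   # H1: mu > mu0
        return 1 - phi(z)
    if alternative == "less":      # H1: mu < mu0 (mirror case)
        return phi(z)
    raise ValueError(f"unknown alternative: {alternative!r}")

# Made-up example: n = 64 observations, known sigma = 2, H0: mu = 10.
z = z_statistic(x_bar=10.5, mu0=10.0, sigma=2.0, n=64)
print(z, p_value(z), p_value(z, "greater"))
```

In this example the one‑sided p‑value is half the two‑sided one, since the observed (Z) lies in the direction of the alternative.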
4.4 Decision Rule
- Significance level (\alpha): commonly (0.05).
- Reject (H_0) if (p<\alpha); otherwise fail to reject.
4.5 Power & Sample Size
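- Power (=1-\beta): probability of rejecting (H_0) when (H_1) is true, where (\beta) is the Type II error probability.
- For a desired power, solve for (n) using the shift (\delta=\sqrt{n}(\mu_1-\mu_0)/\sigma) (the non‑centrality parameter). For the one‑sided Z‑test this gives (n\ge\left(\frac{(z_{\alpha}+z_{\beta})\sigma}{\mu_1-\mu_0}\right)^2), where (z_q) is the upper‑(q) quantile of the standard normal.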
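To make the power and sample‑size relationship concrete, here is a sketch for the one‑sided Z‑test of (H_0:\mu\le\mu_0) vs (H_1:\mu>\mu_0) with known (\sigma). It assumes (\mu_1>\mu_0). The function names and the example values are illustrative, not taken from the slides.

```python
import math
from statistics import NormalDist

def power_one_sided(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided Z-test (H1: mu > mu0) when the true mean is mu1."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)                # reject H0 when Z > z_crit
    shift = (mu1 - mu0) * math.sqrt(n) / sigma     # mean of Z under H1
    return 1 - std.cdf(z_crit - shift)

def required_n(mu0, mu1, sigma, alpha=0.05, power=0.8):
    """Smallest n reaching the desired power; assumes mu1 > mu0."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha)   # upper-alpha quantile
    z_beta = std.inv_cdf(power)        # upper-beta quantile, beta = 1 - power
    return math.ceil(((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2)

# Made-up example: detect a shift of half a standard deviation.
n_needed = required_n(mu0=0.0, mu1=0.5, sigma=1.0)
print(n_needed, power_one_sided(0.0, 0.5, 1.0, n_needed))
```

Halving the effect size (\mu_1-\mu_0) roughly quadruples the required (n), because (n) scales with the inverse square of the difference.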
5. Confidence Intervals
- For (\mu) with known (\sigma):
[ \bar{X}\pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} ]
- Interpretation: if many samples are taken, a fraction (1-\alpha) of the resulting intervals (95% when (\alpha=0.05)) will contain the true (\mu).
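The interval formula above can be computed as follows. This is a minimal sketch assuming known (\sigma). The function name and the example numbers are made up for illustration.

```python
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for mu with known sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    half_width = z * sigma / n ** 0.5
    return x_bar - half_width, x_bar + half_width

# Made-up example: n = 64 observations, known sigma = 2, observed mean 10.5.
low, high = z_confidence_interval(x_bar=10.5, sigma=2.0, n=64)
print(low, high)
```

The half‑width shrinks like (1/\sqrt{n}), so quadrupling the sample size halves the interval width.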
6. Practical Workflow for Algorithm Comparison
- Collect Samples from both algorithms.
- Compute (\bar{X}), (\bar{Y}), and estimated variances.
- Select Test:
- If (\sigma) known → Z‑test.
- If (\sigma) unknown → t‑test, using the sample standard deviation (S) in place of (\sigma) (not detailed in slides but implied).
- Calculate test statistic, p‑value.
- Decide using (\alpha).
- Report a confidence interval for the difference in means (E[X]-E[Y]), centered at (\bar{X}-\bar{Y}).
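The workflow above can be sketched end to end. Note the assumption: since the slides do not detail the t‑test, this sketch swaps in a large‑sample two‑sample Z‑test that plugs the sample variances in for (\sigma^2) and relies on the CLT. For small samples a t‑test would be the more appropriate choice. The function name `compare_means` and the simulated "algorithm" outputs are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def compare_means(xs, ys, alpha=0.05):
    """Large-sample Z-test of H0: E[X] = E[Y] vs H1: E[X] < E[Y]."""
    n, m = len(xs), len(ys)
    diff = statistics.mean(xs) - statistics.mean(ys)
    # Standard error of the difference, using the n - 1 sample variances.
    se = (statistics.variance(xs) / n + statistics.variance(ys) / m) ** 0.5
    z = diff / se
    std = NormalDist()
    p = std.cdf(z)                            # small when x_bar is well below y_bar
    half = std.inv_cdf(1 - alpha / 2) * se    # half-width of the two-sided CI
    return {"diff": diff, "z": z, "p": p, "reject": p < alpha,
            "ci": (diff - half, diff + half)}

# Hypothetical data: simulated outputs of algorithms A and B.
rng = random.Random(1)
xs = [rng.gauss(1.0, 0.5) for _ in range(200)]   # algorithm A
ys = [rng.gauss(1.3, 0.5) for _ in range(200)]   # algorithm B
result = compare_means(xs, ys)
print(result)
```

The returned interval is for (E[X]-E[Y]); if it lies entirely below zero, that is consistent with rejecting (H_0) in favor of (E[X]<E[Y]).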
7. Quiz Details
- Date: Thursday, March 19th (beginning of class).
- Coverage: All material from the first lecture through today’s lecture, including Probability Theory and Statistics.
- Timing: Arrive on time; quiz starts immediately.