Key Insight:
Bayes' rule allows us to invert conditional probabilities. If we know P(symptoms | disease),
we can compute P(disease | symptoms) using prior knowledge P(disease).
Medical Diagnosis Example:
Given: P(positive test | disease) = 0.99, P(disease) = 0.001, P(positive test | no disease) = 0.05
Find: P(disease | positive test)
\[ P(D \mid +) = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.05 \times 0.999} = \frac{0.00099}{0.05094} \approx 0.019 \]
Interpretation: only a 1.9% chance of disease despite the positive test, because the base rate P(disease) = 0.001 is so low!
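The diagnosis calculation above can be sketched in a few lines; the function name and parameter names here are illustrative, not from the lecture.

```python
# Bayes' rule for a binary hypothesis: P(D | +) = P(+ | D) P(D) / P(+).
# The denominator marginalizes over disease / no-disease.

def posterior(likelihood, prior, false_positive_rate):
    """P(disease | positive test) for a two-hypothesis problem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

p = posterior(likelihood=0.99, prior=0.001, false_positive_rate=0.05)
print(round(p, 3))  # 0.019, matching the worked example
```

Note how the tiny prior (0.001) dominates the result even though the test is 99% sensitive.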
4. Marginalization (Summing Out)
Marginalization Formula
\[ P(A) = \sum_{b \in B} P(A, b) = \sum_{b} P(A \mid b) \cdot P(b) \]
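The marginalization formula above amounts to summing rows of a joint table. A minimal sketch, with a made-up joint distribution over two binary variables:

```python
# P(A) = sum over b of P(A, b); the joint table values are illustrative.
joint = {("a", "b1"): 0.2, ("a", "b2"): 0.1,
         ("not_a", "b1"): 0.3, ("not_a", "b2"): 0.4}

# Sum out B to get the marginal P(A = a).
p_a = sum(p for (a, b), p in joint.items() if a == "a")
print(round(p_a, 2))  # 0.3
```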
5. Independence
\[ A \perp B \iff P(A \cap B) = P(A) \cdot P(B) \]
\[ \text{Equivalently: } P(A \mid B) = P(A) \]
Conditional Independence (⭐ Very Important)
\[ A \perp B \mid C \iff P(A \cap B \mid C) = P(A \mid C) \cdot P(B \mid C) \]
\[ \text{Equivalently: } P(A \mid B, C) = P(A \mid C) \]
"A and B are independent given C"
Why it matters:
Independence reduces parameters exponentially!
• Without independence: n binary variables need \(2^n - 1\) parameters
• With independence: only n parameters
• Conditional independence: enables efficient inference in Bayesian networks
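The parameter-count comparison in the bullets above is simple arithmetic; here it is for n = 20 binary variables:

```python
# Parameters needed to specify a distribution over n binary variables.
n = 20
full_joint = 2**n - 1   # no independence assumptions: one number per state, minus 1 for normalization
independent = n         # full mutual independence: one P(X_i = 1) per variable
print(full_joint, independent)  # 1048575 vs 20
```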
Example: Toothache and Catch are conditionally independent given Cavity:
P(Toothache, Catch | Cavity) = P(Toothache | Cavity) × P(Catch | Cavity)
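The Toothache/Catch factorization can be checked numerically on a joint distribution built from conditional tables; the CPT numbers below are illustrative, not from the lecture.

```python
# Build P(Toothache, Catch, Cavity) assuming Toothache ⊥ Catch | Cavity,
# then verify P(T, C | cav) = P(T | cav) * P(C | cav).
p_cavity = {True: 0.2, False: 0.8}
p_tooth_given_cav = {True: 0.6, False: 0.1}   # P(Toothache=1 | Cavity)
p_catch_given_cav = {True: 0.9, False: 0.2}   # P(Catch=1 | Cavity)

def joint(t, c, cav):
    pt = p_tooth_given_cav[cav] if t else 1 - p_tooth_given_cav[cav]
    pc = p_catch_given_cav[cav] if c else 1 - p_catch_given_cav[cav]
    return p_cavity[cav] * pt * pc

cav = True
lhs = joint(True, True, cav) / p_cavity[cav]            # P(T=1, C=1 | Cavity=1)
rhs = p_tooth_given_cav[cav] * p_catch_given_cav[cav]   # product of conditionals
print(abs(lhs - rhs) < 1e-12)  # True
```

This is exactly the factorization a Bayesian network exploits: the joint needs 5 numbers here instead of 7.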
6. Probability Distributions
Joint Distribution
\[ P(X_1, X_2, \ldots, X_n) \]
Complete probability model: specifies probability of every possible state
Marginal Distribution
\[ P(X) = \sum_{y} P(X, y) \]
Probability of a subset of variables (the others are summed out)
Law of Large Numbers (LLN): sample averages converge to the true mean as the sample size grows
Central Limit Theorem (CLT): the sum (or average) of many independent random variables approaches a normal distribution
Why these matter:
• LLN: Justifies using sample statistics to estimate population parameters
• CLT: Explains why normal distribution appears everywhere in nature
• Both are foundations of machine learning and statistical inference
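A quick simulation illustrating the LLN side of the bullets above: sample means of Uniform(0, 1) draws concentrate around the true mean 0.5. (A histogram of the same means would also look bell-shaped, per the CLT.)

```python
import random
random.seed(0)  # fixed seed so the run is reproducible

# 200 sample means, each over 1000 independent Uniform(0, 1) draws.
n, trials = 1000, 200
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

avg = sum(means) / trials
print(round(avg, 2))  # close to 0.5, the mean of Uniform(0, 1)
```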
Key Philosophy:
"Intelligence is not about absolute certainty, but about reasoning optimally under uncertainty."
Logic and probability are complementary, not competing approaches.
12. Common Mistakes to Avoid
❌ Confusion of the Inverse:
P(A | B) ≠ P(B | A) in general
Example: P(spots | measles) ≠ P(measles | spots)
❌ Base Rate Neglect:
Ignoring P(disease) when computing P(disease | symptoms)
Result: overestimating the probability of rare diseases
❌ Assuming Independence:
P(A, B) = P(A) × P(B) only if A and B are independent
Must verify independence, not assume it
❌ Unnormalized Probabilities:
Probabilities must sum to 1
Always normalize: P(A | B) = P(A, B) / P(B)
❌ Conditional Independence Confusion:
A ⊥ B doesn't imply A ⊥ B | C
A ⊥ B | C doesn't imply A ⊥ B
Must check each separately
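The normalization point above in code: divide each unnormalized numerator by their sum so the posterior sums to 1. The numerators reuse the medical-diagnosis example.

```python
# Unnormalized numerators P(+ | h) * P(h) from the diagnosis example.
unnorm = {"disease": 0.99 * 0.001, "no_disease": 0.05 * 0.999}

z = sum(unnorm.values())                         # the evidence P(+)
posterior = {h: v / z for h, v in unnorm.items()}  # now sums to 1

print(round(posterior["disease"], 3))  # 0.019
```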
SE444: Artificial Intelligence | Lecture 10 Cheat Sheet: Uncertainty & Probability
Print this page for quick reference during studying and exams