The Most Important Formula in Probabilistic AI
"Invert conditional probabilities"
POSTERIOR
What we WANT
P(Disease|+Test)
LIKELIHOOD
What we KNOW
P(+Test|Disease)
PRIOR
Base Rate
P(Disease)=1%
EVIDENCE
Normalizer
P(+Test) Total
Bayesian Inference Flow:
Scenario: Testing for a rare disease
Prior
1% have disease
Absolute probability
BEFORE seeing test
Likelihood
95% sensitivity
Test accuracy
IF disease present
Evidence
5.9% test positive
Overall positive rate
(all causes)
Posterior
16.1% chance!
Updated belief
AFTER seeing test
⚠️ Key Insight: Despite a 95% accurate test, there is only a 16.1% chance of disease because the base rate is low (1%)!
Calculation: (0.95 × 0.01) / 0.059 = 0.161
📊 Understanding the Bayesian Reasoning:
🔵 Prior (1%): Before any test, we know only 1 in 100 people have this disease (the base rate in the population).
🟣 Likelihood (95%): We know the test correctly identifies 95% of sick patients (but what about healthy ones?).
🟢 Evidence (5.9%): Overall, 5.9% of all people test positive (a mix of true positives from the 1% sick and false positives from the 99% healthy).
🔴 Posterior (16.1%): After testing positive, we update our belief to 16.1% - much higher than 1%, but still most likely a false positive!
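The arithmetic above can be checked in a few lines of Python. The numbers match the text; the 5% false-positive rate is an assumption inferred from the 5.9% overall positive rate (0.95 × 0.01 + 0.05 × 0.99 = 0.059).

```python
# Bayes' rule for the rare-disease scenario.
prior = 0.01        # P(Disease): the base rate
sensitivity = 0.95  # P(+Test | Disease)
fpr = 0.05          # P(+Test | No Disease), assumed from the 5.9% evidence

# Law of total probability: P(+Test) over both causes of a positive test.
evidence = sensitivity * prior + fpr * (1 - prior)

# Bayes' rule: P(Disease | +Test)
posterior = sensitivity * prior / evidence

print(f"P(+Test)           = {evidence:.3f}")   # 0.059
print(f"P(Disease | +Test) = {posterior:.3f}")  # 0.161
```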
Scenario: Sky is cloudy. What's probability of rain?
Prior
10% of days rain
Absolute probability
BEFORE seeing sky
Likelihood
80% rainy days cloudy
Cloud probability
IF raining
Evidence
35% of days cloudy
Overall cloudy rate
(observed now)
Posterior
22.9% chance rain
Updated belief
AFTER seeing clouds
☁️ Key Insight: A cloudy sky increases the rain probability from 10% (base rate) to 22.9%!
Calculation: (0.80 × 0.10) / 0.35 = 0.229
📊 Understanding the Bayesian Reasoning:
🔵 Prior (10%): In Riyadh, historically 10% of all days have rain (the base rate from past data).
🟣 Likelihood (80%): We know that when it rains, the sky is cloudy 80% of the time (clouds are common with rain).
🟢 Evidence (35%): Overall, 35% of all days are cloudy (whether it rains or not - clouds happen for many reasons).
🔴 Posterior (22.9%): Now that we see clouds, we update our belief: rain is more likely than the 10% base rate, but still not probable!
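The same update can be written as a small helper; the function name is illustrative, and the numbers come straight from the text.

```python
# Bayes' rule for the cloudy-sky scenario.
def bayes(prior, likelihood, evidence):
    """P(H | E) = P(E | H) * P(H) / P(E)."""
    return likelihood * prior / evidence

# P(Rain) = 0.10, P(Cloudy | Rain) = 0.80, P(Cloudy) = 0.35
p_rain_given_cloudy = bayes(prior=0.10, likelihood=0.80, evidence=0.35)
print(f"P(Rain | Cloudy) = {p_rain_given_cloudy:.3f}")  # 0.229
```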
Bayes' rule is the most important formula in probabilistic AI. It allows us to invert conditional probabilities: to go from what we can measure, P(B|A), to what we actually want, P(A|B).
Conditional probability is the probability of A happening, given that B has already happened.
Read as: "Probability of A given B", defined as P(A|B) = P(A ∩ B) / P(B).
Not from axioms - from DATA! Unlike the three axioms of probability (which are assumptions), specific probabilities like P(Disease)=1% or P(Cloudy|Rain)=80% come from observations: historical records, clinical studies, weather data, or expert knowledge. Bayes' rule then combines these observed probabilities to make inferences.
Example: Weather in Riyadh
Data Source: Table values from 365 days of weather observations in Riyadh. These are empirical probabilities (observed frequencies), not theoretical.
| | ☀️ Sunny | ☁️ Cloudy | Total |
|---|---|---|---|
| 🌬️ Windy | 0.15 | 0.25 | 0.40 |
| 🍃 Calm | 0.35 | 0.25 | 0.60 |
| Total | 0.50 | 0.50 | 1.00 |
P(Windy | Sunny) = ?
Step 1: Find P(Windy AND Sunny) from table = 0.15
Step 2: Find P(Sunny) from total column = 0.50
Step 3: Apply the formula: P(Windy | Sunny) = P(Windy ∩ Sunny) / P(Sunny) = 0.15 / 0.50 = 0.30
"30% of sunny days are windy"
P(Sunny | Windy) = ?
Step 1: Find P(Sunny AND Windy) from table = 0.15
Step 2: Find P(Windy) from total row = 0.40
Step 3: Apply the formula: P(Sunny | Windy) = P(Sunny ∩ Windy) / P(Windy) = 0.15 / 0.40 = 0.375
"37.5% of windy days are sunny"
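Both conditionals can be computed directly from the joint table above; the dictionary encoding of the table is an illustrative choice.

```python
# Joint probabilities from the Riyadh weather table (365 days of observations).
joint = {
    ("windy", "sunny"): 0.15, ("windy", "cloudy"): 0.25,
    ("calm",  "sunny"): 0.35, ("calm",  "cloudy"): 0.25,
}

# Marginals: sum each row/column of the joint table.
p_sunny = sum(p for (w, s), p in joint.items() if s == "sunny")  # 0.50
p_windy = sum(p for (w, s), p in joint.items() if w == "windy")  # 0.40

# Conditional probability: P(A|B) = P(A and B) / P(B).
p_windy_given_sunny = joint[("windy", "sunny")] / p_sunny  # 0.30
p_sunny_given_windy = joint[("windy", "sunny")] / p_windy  # 0.375

print(p_windy_given_sunny, p_sunny_given_windy)
```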
P(A|B) ≠ P(B|A) in general! Knowing B occurred changes our belief about A, but the degree of change is different in each direction. This is why we need Bayes' rule.
Two events A and B are independent if knowing that one event occurred does not change the probability of the other event.
"Learning B happened doesn't affect probability of A"
Definition: P(A ∩ B) = P(A) × P(B), equivalently P(A|B) = P(A)
Examples: two coin flips, two dice rolls
Key Property:
Events don't influence each other!
Definition: P(A ∩ B) ≠ P(A) × P(B), equivalently P(A|B) ≠ P(A)
Examples: clouds and rain, test result and disease
Key Property:
Events influence each other!
✅ If INDEPENDENT:
Example: P(Heads on coin 1 AND Heads on coin 2)
= 0.5 × 0.5 = 0.25
❌ If DEPENDENT:
Example: P(Cloudy AND Rain)
≠ P(Cloudy) × P(Rain)
Must use P(Rain|Cloudy)!
✅ If INDEPENDENT:
Example: P(Heads on coin 2 | Heads on coin 1)
= P(Heads on coin 2) = 0.5
Coin 1 doesn't affect coin 2!
❌ If DEPENDENT:
Example: P(Rain | Cloudy) ≠ P(Rain)
If cloudy, rain is more likely!
Clouds affect rain probability!
General Formula (always true): P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
✅ If INDEPENDENT:
Example: P(Heads on coin 1 OR Heads on coin 2)
= 0.5 + 0.5 - (0.5 × 0.5) = 0.75
❌ If DEPENDENT:
Example: P(Cloudy OR Rain)
Must find P(Cloudy ∩ Rain) from data,
cannot use P(Cloudy) × P(Rain)!
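The independence check can be automated: A and B are independent exactly when P(A ∩ B) = P(A) × P(B), which is equivalent to P(A|B) = P(A). A minimal sketch (the function name and tolerance are illustrative):

```python
# A and B are independent iff P(A and B) == P(A) * P(B).
# The tolerance absorbs floating-point rounding.
def independent(p_a, p_b, p_a_and_b, tol=1e-9):
    return abs(p_a_and_b - p_a * p_b) < tol

# Two fair coins: independent.
print(independent(0.5, 0.5, 0.25))    # True

# Windy and Sunny from the Riyadh table:
# P(Windy)=0.40, P(Sunny)=0.50, but P(Windy and Sunny)=0.15, not 0.20.
print(independent(0.40, 0.50, 0.15))  # False
```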
Select a scenario and see if events are independent by comparing P(A|B) with P(A).
Bayesian Networks and Probabilistic Graphical Models explicitly model dependencies between variables, so understanding independence is essential for building and reading them.
We want to flip a conditional probability: given P(B|A), find P(A|B). This is like converting "If it rains, there are clouds" into "If there are clouds, will it rain?"
Follow the mathematical steps from conditional probability to Bayes' Rule
By definition, the probability of A given B is:
P(A|B) = P(A ∩ B) / P(B)
"What fraction of the times B happens does A also happen?"
We can also write the conditional probability in reverse:
P(B|A) = P(A ∩ B) / P(A)
"A ∩ B is the same as B ∩ A (intersection is symmetric)"
Multiply both sides of Step 2 by P(A):
P(A ∩ B) = P(B|A) × P(A)
"This is the multiplication rule for dependent events!"
Replace P(A ∩ B) in Step 1 with the expression from Step 3:
From Step 1: P(A|B) = P(A ∩ B) / P(B)
Substituting: P(A|B) = P(B|A) × P(A) / P(B)
🎉 This is Bayes' Rule!
We've successfully inverted the conditional probability!
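The derivation can be sanity-checked numerically. Here the values come from the Riyadh weather table, with A = Windy and B = Sunny; variable names are illustrative.

```python
# Numeric check of the derivation: A = Windy, B = Sunny.
p_a, p_b, p_a_and_b = 0.40, 0.50, 0.15

p_a_given_b = p_a_and_b / p_b  # Step 1: P(A|B) = P(A and B) / P(B)
p_b_given_a = p_a_and_b / p_a  # Step 2: reverse conditional

# Step 4: Bayes' rule must reproduce Step 1 exactly.
assert abs(p_b_given_a * p_a / p_b - p_a_given_b) < 1e-12
print(p_a_given_b)  # 0.3
```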
Let's apply each step with actual values
Given information:
P(Disease) = 0.01 (prior), P(+Test|Disease) = 0.95 (likelihood), P(+Test) = 0.059 (evidence)
Goal:
Find: P(Disease|+Test) = ?
If the test is positive, what's the probability of actually having the disease?
Apply Bayes' Rule:
P(Disease|+Test) = (0.95 × 0.01) / 0.059 = 0.161
Result: Only a 16.1% chance of disease despite a positive test!
Bayes' rule lets us invert conditional probabilities:
✅ What we can easily observe: P(Evidence | Cause), e.g. P(+Test | Disease) from clinical studies.
🎯 What we actually need: P(Cause | Evidence), e.g. P(Disease | +Test) for an individual patient.
This "probability inversion" is fundamental to diagnosis, prediction, machine learning, and AI reasoning!
What we believed before seeing evidence B
How likely is evidence B if A is true?
Total probability of seeing evidence B
Updated belief after seeing evidence B
A patient tests positive for a disease. What's the probability they actually have it? The answer is often counterintuitive - Bayes' rule reveals the truth!
Adjust the sliders to see how base rates dramatically affect diagnosis!
Law of Total Probability
Posterior Probability
Even with 95% accurate test, if disease is rare (1%), positive test only means ~16% chance of disease!
This is because false positives from the 99% healthy population outnumber true positives from the 1% sick population.
Base rates matter! This is why Bayes' rule is essential.
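A quick sweep over base rates makes the point concrete. The test is held fixed at 95% sensitivity and a 5% false-positive rate (the assumption behind the text's 5.9% evidence); only the prior changes.

```python
# How the posterior depends on the base rate for a fixed test.
def posterior_given_positive(prior, sensitivity=0.95, fpr=0.05):
    # Law of total probability, then Bayes' rule.
    evidence = sensitivity * prior + fpr * (1 - prior)
    return sensitivity * prior / evidence

for prior in (0.001, 0.01, 0.10, 0.50):
    print(f"base rate {prior:6.1%} -> P(disease | +) = "
          f"{posterior_given_positive(prior):.1%}")
```

With a 1% base rate the posterior is only about 16%; at a 50% base rate the same test yields 95%.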
It's 2 AM. Your home alarm blares! 🚨
Your heart races: is it a burglar 🦹 or just an earthquake 🌍 shaking the house?
Bayes' rule helps you figure out what really caused that alarm!
Real alarms don't just say "burglar!" - they can be triggered by many causes.
Bayes' rule lets us update our beliefs about multiple possible causes when we get evidence.
What % of time does alarm go off?
Updated beliefs after hearing alarm
Multiple Causes:
Alarms don't just detect burglars - they respond to any trigger. Bayes helps us disentangle competing explanations.
Evidence Updates:
The alarm is evidence that shifts our beliefs, but doesn't give perfect certainty about what caused it.
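With several candidate causes, Bayes' rule scores each one and the evidence P(alarm) normalizes them into a posterior distribution. The priors and likelihoods below are hypothetical placeholder values (the text's sliders don't fix any), chosen only to show the mechanics:

```python
# Hypothetical numbers: prior P(cause) and likelihood P(alarm | cause).
priors      = {"burglar": 0.001, "earthquake": 0.002, "other": 0.997}
likelihoods = {"burglar": 0.95,  "earthquake": 0.30,  "other": 0.001}

# Law of total probability: P(alarm) summed over all causes.
p_alarm = sum(priors[c] * likelihoods[c] for c in priors)

# Bayes' rule for each competing explanation, given the alarm went off.
posterior = {c: priors[c] * likelihoods[c] / p_alarm for c in priors}

for cause, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P({cause} | alarm) = {p:.1%}")
```

Note the posteriors sum to 1: the alarm shifts belief among the causes but never delivers certainty about any single one.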
"Bayes' rule is the mathematical foundation for how AI systems update beliefs with evidence. Every time a spam filter learns, a medical AI diagnoses, or a robot localizes itself, Bayes' rule is working behind the scenes."