Real-world applications of Bayes' theorem: Learning from evidence and updating beliefs
Bayesian inference is the process of updating our beliefs about the world as we gather new evidence. It's how rational agents should think when faced with uncertainty!
Bayes' theorem: P(H | E) = P(E | H) × P(H) / P(E), where H = Hypothesis and E = Evidence.
Bayesian inference is the optimal way to learn from data. It's used in: medical diagnosis, spam filters, machine learning, robotics, autonomous vehicles, weather forecasting, and more!
During the COVID-19 pandemic, you take a rapid antigen test. It's positive. What's the probability you actually have COVID-19? The answer might surprise you!
1% of population has COVID
95% true positive rate
5% false positive rate
We want to find: P(COVID | +) = Probability of COVID given positive test
Use the law of total probability:
P(+) = P(+ | COVID) × P(COVID) + P(+ | ¬COVID) × P(¬COVID) = 0.95 × 0.01 + 0.05 × 0.99 = 0.059
So 5.9% of all people (sick and healthy) will test positive.
P(COVID | +) = (0.95 × 0.01) / 0.059 ≈ 0.161. Even with a positive test, there's only a 16.1% chance of having COVID!
Key insight: Most positive tests come from false positives, not true disease cases!
Bottom line: Of every 5.9 positive tests, only 0.95 (16.1%) come from actual COVID cases. The remaining 4.95 (83.9%) are false positives from the 99% healthy population! This is why rare diseases make positive tests unreliable.
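The arithmetic above can be checked with a short script (variable names are my own):

```python
# Posterior probability of COVID given a positive rapid test.
prevalence = 0.01        # P(COVID): 1% of the population is infected
sensitivity = 0.95       # P(+ | COVID): true positive rate
false_positive = 0.05    # P(+ | no COVID): false positive rate

# Law of total probability: P(+) over both infected and healthy people
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Bayes' theorem: P(COVID | +)
posterior = sensitivity * prevalence / p_positive

print(f"P(+) = {p_positive:.3f}")          # 0.059
print(f"P(COVID | +) = {posterior:.3f}")   # 0.161
```

Rerunning with a higher prevalence shows the same effect as the interactive slider: the posterior rises sharply as the disease becomes more common.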
Adjust the parameters and see how the posterior probability changes!
Step 1: Calculate P(+)
Step 2: Apply Bayes' Theorem
P(COVID | Positive Test)
Try this: Increase the COVID prevalence to 50% and observe how the posterior probability changes dramatically! With common diseases, positive tests are much more reliable.
Note: Sensitivity P(+|COVID) and False Positive P(+|¬COVID) are independent parameters. They don't need to sum to 1; each measures a different aspect of test performance.
You open your email and see this subject line:
"CONGRATULATIONS! You Won FREE iPhone - CLICK HERE to Claim!"
Should you trust this email? Bayesian spam filters (like Gmail's) analyze word patterns to protect you from scams!
Spam filters learn from thousands of emails. They ask: "How often does each word appear in spam vs. legitimate emails?" This creates a "word fingerprint" that identifies suspicious patterns.
What these numbers mean:
40% of emails are SPAM: P(Spam) = 0.40
60% of emails are LEGITIMATE: P(Ham) = 0.60
Based on historical email data, we know that 4 out of 10 emails are typically spam.
In reality, words in emails are often dependent (e.g., "FREE" and "WIN" often appear together). But Naive Bayes makes a simplifying assumption: words appear independently.
Instead of calculating the joint likelihood P(FREE, WIN, CLICK | Spam) directly, we simplify to a product of per-word likelihoods:
P(FREE | Spam) × P(WIN | Spam) × P(CLICK | Spam) = 0.80 × 0.70 × 0.65 = 0.364
Why it works: Even though the assumption is "naive," it performs remarkably well for text classification and is very fast to compute!
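Under the independence assumption, the joint likelihood collapses to a simple product. A minimal sketch (pairing the 0.80/0.70/0.65 values with FREE/WIN/CLICK in that order is my assumption):

```python
import math

# Per-word spam likelihoods P(word | Spam) from the worked example
word_likelihoods = {"FREE": 0.80, "WIN": 0.70, "CLICK": 0.65}

# Naive Bayes: words are assumed independent given the class,
# so the joint likelihood is just the product of per-word terms
joint = math.prod(word_likelihoods.values())
print(f"P(FREE, WIN, CLICK | Spam) = {joint:.3f}")  # 0.364
```

Note how the product shrinks as more words are added; real filters work in log-space to avoid underflow with hundreds of words.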
Given that we see "FREE", "WIN", and "CLICK" together, what's the probability this email is spam vs. legitimate? We use Bayes' theorem to weigh the evidence!
How likely are these words in spam?
36.4% chance of seeing this word combo in spam
Multiply by prior belief:
Weighted spam likelihood
How likely are these words in legitimate emails?
0.075% chance of seeing this word combo in legitimate emails
Multiply by prior belief:
Weighted legitimate likelihood
Total probability of observing these words:
Probability it's SPAM:
Probability it's HAM:
99.7% confidence this email is spam. The combination of "FREE", "WIN", and "CLICK" is extremely indicative of spam emails. These words are much more common in spam than in legitimate emails.
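Putting the numbers together, the posterior follows from the two prior-weighted likelihoods (the ham likelihood 0.00075 comes from the 0.075% figure above):

```python
# Likelihoods of seeing FREE + WIN + CLICK in each class
p_words_given_spam = 0.364     # 0.80 * 0.70 * 0.65
p_words_given_ham = 0.00075    # 0.075% in legitimate email

# Priors from historical data
p_spam, p_ham = 0.40, 0.60

# Weight each likelihood by its prior, then normalize
spam_score = p_words_given_spam * p_spam    # 0.1456
ham_score = p_words_given_ham * p_ham       # 0.00045
p_spam_posterior = spam_score / (spam_score + ham_score)

print(f"P(Spam | words) = {p_spam_posterior:.1%}")  # 99.7%
```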
High confidence spam
Very unlikely legitimate
Bayesian spam filters learn from examples. They track word frequencies in spam vs. ham emails, then use Bayes' theorem to classify new emails. Modern filters use thousands of features (words, phrases, metadata) and achieve >99% accuracy!
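A toy version of that learning loop might look like this; the tiny training corpus and the Laplace smoothing constant are illustrative assumptions, not from the text:

```python
from collections import Counter

# Tiny labeled corpus (illustrative only)
spam_emails = ["free win click now", "win free prize click"]
ham_emails = ["meeting at noon", "project update attached", "lunch tomorrow"]

def word_probs(emails, vocab, alpha=1.0):
    """Estimate P(word | class) from counts, with Laplace smoothing."""
    counts = Counter(w for e in emails for w in e.split())
    total = sum(counts.values())
    return {w: (counts[w] + alpha) / (total + alpha * len(vocab)) for w in vocab}

vocab = {w for e in spam_emails + ham_emails for w in e.split()}
p_w_spam = word_probs(spam_emails, vocab)
p_w_ham = word_probs(ham_emails, vocab)

def classify(email, p_spam=0.4, p_ham=0.6):
    """Return P(spam | words) via naive Bayes with the priors above."""
    spam_score, ham_score = p_spam, p_ham
    for w in email.split():
        if w in vocab:
            spam_score *= p_w_spam[w]
            ham_score *= p_w_ham[w]
    return spam_score / (spam_score + ham_score)

print(classify("free click"))      # spam-like words score well above 0.5
print(classify("meeting at noon")) # ham-like words score well below 0.5
```

Production filters follow the same recipe at scale, just with far more features and log-probabilities.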
Your autonomous car is driving through a parking garage.
GPS says you're near the entrance, but cameras see blue walls (typical of the back section).
Which sensor should you trust? Neither is perfect! Bayesian sensor fusion combines both for accurate localization!
GPS gives a rough estimate of your general area, while cameras provide detailed visual information about your immediate surroundings. Neither is perfect, but together they give accurate localization!
GPS gives a general area estimate
GPS says you're in this general vicinity:
GPS Insight: Satellite positioning gives rough area estimates but can be inaccurate in enclosed spaces like parking garages.
Camera sees specific wall colors
Camera detects: "Blue wall visible"
How likely is this at each location?
Camera Insight: Visual sensors provide detailed local information but can be affected by lighting and obstructions.
GPS says you're probably at the entrance, but camera sees blue walls. Bayes' theorem combines these conflicting signals to give the most accurate location estimate!
Likelihood × Prior for each location:
Location A (Entrance): GPS said 50% likely, camera says 10% chance of blue wall → 0.50 × 0.10 = 0.05
Location B (Back Section): GPS said 30% likely, camera says 80% chance of blue wall → 0.30 × 0.80 = 0.24
Location C (Middle Area): GPS said 20% likely, camera says 40% chance of blue wall → 0.20 × 0.40 = 0.08
Total evidence strength: 0.05 + 0.24 + 0.08 = 0.37 (sum of all raw scores)
Final Location Probabilities:
Normalization: Divide each raw score by total to get valid probabilities that sum to 100%.
GPS thought Location A was most likely
Camera evidence completely changed our belief!
The camera's visual evidence overrode GPS! Even though GPS was more confident about Location A (50%), the camera's strong signal for blue walls (80% at Location B) completely shifted our belief to Location B (64.9%).
Key Insight: Multiple sensors working together can be more accurate than any single sensor alone. This is why self-driving cars use many sensors!
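The fusion step above can be sketched in a few lines (location labels and all numbers are taken from the example):

```python
# GPS prior P(location) and camera likelihood P(blue wall | location)
prior = {"A (Entrance)": 0.50, "B (Back Section)": 0.30, "C (Middle)": 0.20}
likelihood = {"A (Entrance)": 0.10, "B (Back Section)": 0.80, "C (Middle)": 0.40}

# Raw scores: likelihood × prior for each location
scores = {loc: likelihood[loc] * prior[loc] for loc in prior}
total = sum(scores.values())  # 0.37: total evidence strength

# Normalize so the posteriors sum to 1
posterior = {loc: s / total for loc, s in scores.items()}
for loc, p in posterior.items():
    print(f"{loc}: {p:.1%}")
# Location B (Back Section) comes out on top at about 64.9%
```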
Notice how the posterior (green) differs from the prior (blue) after incorporating camera evidence
Experiment with different sensor characteristics and see how they affect localization accuracy!
Posterior Probability
Individual sensors are imperfect, but when combined intelligently using Bayesian methods, they create systems that are more reliable than any single sensor alone. This is why modern autonomous systems use 10+ different sensors working together!
Bayesian inference is the mathematically optimal way to learn from data.
It's the foundation of modern AI, from medical diagnosis to spam filters to self-driving cars.
Every time you see new evidence, ask: "How should this update my beliefs?"