
Bayesian Inference in Practice

Real-world applications of Bayes' theorem: Learning from evidence and updating beliefs

What is Bayesian Inference?

The Core Idea

Bayesian inference is the process of updating our beliefs about the world as we gather new evidence. It's how rational agents should think when faced with uncertainty!

The Formula
P(H | E) = P(E | H) · P(H) / P(E)

H = Hypothesis, E = Evidence

Components
  • Prior P(H): Initial belief before evidence
  • Likelihood P(E|H): How likely is evidence if hypothesis is true?
  • Evidence P(E): Total probability of observing evidence
  • Posterior P(H|E): Updated belief after seeing evidence
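The four components combine into one small update rule. Here is a minimal Python sketch (the function and parameter names are illustrative, not from any library), expanding P(E) with the law of total probability:

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(H|E) from P(H), P(E|H), and P(E|not-H).

    P(E) is expanded via the law of total probability:
    P(E) = P(E|H) * P(H) + P(E|not-H) * (1 - P(H))
    """
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Example: a weak prior plus strong evidence
print(round(posterior(0.01, 0.95, 0.05), 3))  # 0.161
```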
Why It Matters

Bayesian inference is the optimal way to learn from data. It's used in: medical diagnosis, spam filters, machine learning, robotics, autonomous vehicles, weather forecasting, and more!

Example 1: Medical Diagnosis

The Scenario

During the COVID-19 pandemic, you take a rapid antigen test. It's positive. What's the probability you actually have COVID-19? The answer might surprise you!

Given Information
🦠
COVID Prevalence
P(COVID) = 0.01

1% of population has COVID

✅
Test Sensitivity
P(+ | COVID) = 0.95

95% true positive rate

❌
False Positive Rate
P(+ | ¬COVID) = 0.05

5% false positive rate

Step-by-Step Solution
Step 1: Identify What We're Looking For

We want to find: P(COVID | +) = Probability of COVID given positive test

Question: If test is positive, what's the chance I have COVID?
Step 2: Calculate P(+) - Total Probability of Positive Test

Use the law of total probability:

P(+) = P(+ | COVID) · P(COVID) + P(+ | ¬COVID) · P(¬COVID)
P(+) = 0.95 × 0.01 + 0.05 × 0.99
P(+) = 0.0095 + 0.0495
P(+) = 0.059

5.9% of all people (sick and healthy) will test positive

Step 3: Apply Bayes' Theorem
P(COVID | +) = [P(+ | COVID) · P(COVID)] / P(+)
P(COVID | +) = (0.95 × 0.01) / 0.059
P(COVID | +) = 0.0095 / 0.059
P(COVID | +) ≈ 0.161 = 16.1%
Result

Even with a positive test, there's only a 16.1% chance of having COVID!
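The three steps above can be reproduced in a few lines of Python (variable names are our own, values are from the worked example):

```python
# Values from the worked example; variable names are our own.
p_covid = 0.01        # prior: prevalence, P(COVID)
p_pos_covid = 0.95    # sensitivity, P(+ | COVID)
p_pos_healthy = 0.05  # false positive rate, P(+ | not COVID)

# Step 2: law of total probability for P(+)
p_pos = p_pos_covid * p_covid + p_pos_healthy * (1 - p_covid)

# Step 3: Bayes' theorem
p_covid_pos = p_pos_covid * p_covid / p_pos

print(round(p_pos, 3), round(p_covid_pos, 3))  # 0.059 0.161
```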

Why So Low? The Base Rate Fallacy

Key insight: Most positive tests come from false positives, not true disease cases!

Out of 100 people:
• 1 has COVID
• 99 are healthy
True positives:
• 1 × 95% = 0.95
(actual COVID cases)
False positives:
• 99 × 5% = 4.95
(healthy people)

Bottom line: Of every 5.9 positive tests, only 0.95 (16.1%) come from actual COVID cases. The remaining 4.95 (83.9%) are false positives from the 99% healthy population! This is why rare diseases make positive tests unreliable.
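The base-rate argument is just counting. A minimal Python sketch (scaling to 10,000 people so the counts come out whole) reproduces the 16.1% by tallying who tests positive:

```python
# Scale the frequencies up to 10,000 people so counts are whole numbers.
population = 10_000
sick = population * 0.01          # 100 people actually have COVID
healthy = population - sick       # 9,900 do not

true_pos = sick * 0.95            # sick people who test positive
false_pos = healthy * 0.05        # healthy people who test positive

# Among everyone who tested positive, what fraction is actually sick?
share_sick = true_pos / (true_pos + false_pos)
print(round(share_sick, 3))  # 0.161
```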

Interactive Medical Diagnosis Calculator

Adjust the parameters and see how the posterior probability changes!

COVID Prevalence P(COVID): 1.0%
Sensitivity P(+|COVID): 95%
False Positive P(+|¬COVID): 5%

Calculation Breakdown

Step 1: Calculate P(+)

P(+) = 0.95 × 0.01 + 0.05 × 0.99 = 0.059

Step 2: Apply Bayes' Theorem

P(COVID|+) = (0.95 × 0.01) / 0.059 ≈ 0.161

Posterior Probability

P(COVID | Positive Test) = 16.1%
Try this: Increase the COVID prevalence to 50% and observe how the posterior probability changes dramatically! With common diseases, positive tests are much more reliable.

Note: Sensitivity P(+|COVID) and the false positive rate P(+|¬COVID) are independent parameters. They don't need to sum to 1; each measures a different aspect of test performance.
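The suggested experiment can also be run in code. A small sketch, assuming the calculator's default sensitivity (95%) and false positive rate (5%), sweeping the prevalence:

```python
def posterior(prev, sens=0.95, fpr=0.05):
    """P(disease | positive test) for a given prevalence."""
    return sens * prev / (sens * prev + fpr * (1 - prev))

for prev in (0.01, 0.10, 0.50):
    print(f"prevalence {prev:.0%} -> posterior {posterior(prev):.1%}")
# prevalence 1% -> posterior 16.1%
# prevalence 10% -> posterior 67.9%
# prevalence 50% -> posterior 95.0%
```

As the disease gets more common, a positive test becomes far more informative.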

Example 2: Spam Email Filter

📧 1. The Suspicious Email

📧 Your Inbox: Suspicious Email Alert

You open your email and see this subject line:
"🎉 CONGRATULATIONS! You Won FREE iPhone - CLICK HERE to Claim!"

Should you trust this email? Bayesian spam filters (like Gmail's) analyze word patterns to protect you from scams!

๐Ÿ” 2. Word Analysis: The "Spam Fingerprint"

๐Ÿ” Breaking Down the Email: "FREE", "WIN", "CLICK"
How Spam Filters Think

Spam filters learn from thousands of emails. They ask: "How often does each word appear in spam vs. legitimate emails?" This creates a "word fingerprint" that identifies suspicious patterns.

"FREE"
🚫
SPAM
80%
✅
HAM
10%

What these numbers mean:

  • Out of every 10 spam emails: 8 contain "FREE", 2 don't
  • Out of every 10 legitimate emails: 1 contains "FREE", 9 don't
  • This makes "FREE" a strong spam indicator!
"WIN"
🚫
SPAM
70%
✅
HAM
5%

What these numbers mean:

  • Out of every 10 spam emails: 7 contain "WIN", 3 don't
  • Out of every 20 legitimate emails: only 1 contains "WIN"
  • Very suspicious - "WIN" appears rarely in real emails!
"CLICK"
🚫
SPAM
65%
✅
HAM
15%

What these numbers mean:

  • Out of every 20 spam emails: 13 contain "CLICK", 7 don't
  • Out of every 20 legitimate emails: 3 contain "CLICK", 17 don't
  • Common in both, but still favors spam classification
What We Know About Email Traffic
🚫
40%

of emails are SPAM

P(Spam) = 0.40
✅
60%

are LEGITIMATE

P(Ham) = 0.60

Based on historical email data, we know that 4 out of 10 emails are typically spam.

🤔 3. The "Naive" Assumption

🤔 The "Naive" Trick: Treating Words as Independent

In reality, words in emails are often dependent (e.g., "FREE" and "WIN" often appear together). But Naive Bayes makes a simplifying assumption: words appear independently.

Instead of calculating:

P(FREE, WIN, CLICK | Spam) = [complex joint probability]

We simplify to:

P(FREE, WIN, CLICK | Spam) = P(FREE|Spam) × P(WIN|Spam) × P(CLICK|Spam)

= 0.80 × 0.70 × 0.65 = 0.364

Why it works: Even though the assumption is "naive," it performs remarkably well for text classification and is very fast to compute!
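The naive product rule is tiny in code. A minimal sketch with the per-word probabilities hard-coded from the tables above (`joint` and the dictionary names are our own):

```python
# Per-word likelihoods hard-coded from the word tables above.
p_word_spam = {"FREE": 0.80, "WIN": 0.70, "CLICK": 0.65}
p_word_ham  = {"FREE": 0.10, "WIN": 0.05, "CLICK": 0.15}

def joint(words, table):
    """Naive assumption: multiply the per-word probabilities."""
    prob = 1.0
    for w in words:
        prob *= table[w]
    return prob

words = ["FREE", "WIN", "CLICK"]
print(round(joint(words, p_word_spam), 5))  # 0.364
print(round(joint(words, p_word_ham), 5))   # 0.00075
```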

โš–๏ธ 4. Bayesian Analysis

โš–๏ธ Bayesian Analysis: Spam vs Legitimate
The Big Question

Given that we see "FREE", "WIN", and "CLICK" together, what's the probability this email is spam vs. legitimate? We use Bayes' theorem to weigh the evidence!

Step 1: If It's SPAM...

How likely are these words in spam?

P(FREE, WIN, CLICK | Spam) = 0.80 × 0.70 × 0.65
= 0.364

36.4% chance of seeing this word combo in spam

Multiply by prior belief:

P(Words|Spam) × P(Spam) = 0.364 × 0.40
= 0.1456

Weighted spam likelihood

Step 2: If It's LEGITIMATE...

How likely are these words in legitimate emails?

P(FREE, WIN, CLICK | Ham) = 0.10 × 0.05 × 0.15
= 0.00075

0.075% chance of seeing this word combo in legitimate emails

Multiply by prior belief:

P(Words|Ham) × P(Ham) = 0.00075 × 0.60
= 0.00045

Weighted legitimate likelihood

Step 3: Normalize to Get Posterior Probabilities

Total probability of observing these words:

P(Words) = 0.1456 + 0.00045 = 0.14605

Probability it's SPAM:

P(Spam | Words) = 0.1456 / 0.14605 = 99.7%

Probability it's HAM:

P(Ham | Words) = 0.00045 / 0.14605 = 0.3%
Verdict: SPAM!

99.7% confidence this email is spam. The combination of "FREE", "WIN", and "CLICK" is extremely indicative of spam emails. These words are much more common in spam than in legitimate emails.
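Steps 1 through 3 can be checked end to end. A short Python sketch using the numbers above (variable names are our own):

```python
# Priors and naive likelihoods from the steps above.
p_spam, p_ham = 0.40, 0.60
like_spam = 0.80 * 0.70 * 0.65   # P(words | spam) = 0.364
like_ham  = 0.10 * 0.05 * 0.15   # P(words | ham)  = 0.00075

score_spam = like_spam * p_spam  # 0.1456
score_ham  = like_ham * p_ham    # 0.00045
total = score_spam + score_ham   # P(words) = 0.14605

print(f"P(Spam|Words) = {score_spam / total:.1%}")  # 99.7%
print(f"P(Ham|Words)  = {score_ham / total:.1%}")   # 0.3%
```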

📊 5. Results & Key Insights

Probability Comparison
🚫 SPAM
99.7%

High confidence spam

✅ HAM (Legitimate)
0.3%

Very unlikely legitimate

Key Insights

Bayesian spam filters learn from examples. They track word frequencies in spam vs. ham emails, then use Bayes' theorem to classify new emails. Modern filters use thousands of features (words, phrases, metadata) and achieve >99% accuracy!

Example 3: Robot Sensor Fusion

🚗 1. Self-Driving Car Localization

🚗 The Self-Driving Car Challenge

Your autonomous car is driving through a parking garage.
GPS says you're near the entrance, but cameras see blue walls (typical of the back section).

Which sensor should you trust? Neither is perfect! Bayesian sensor fusion combines both for accurate localization!

📡 2. Understanding GPS vs Camera Sensors

🏭 Parking Garage Layout: 3 Possible Locations
How Sensors Work

GPS gives a rough estimate of your general area, while cameras provide detailed visual information about your immediate surroundings. Neither is perfect, but together they give accurate localization!

๐Ÿญ Parking Garage: 3 Distinct Areas
๐Ÿšช
Location A
Entrance Area
(Gray walls)
๐Ÿ”ต
Location B
Back Section
(Blue walls)
๐ŸŸก
Location C
Middle Area
(Yellow walls)
GPS Sensor: Rough Location Estimate
📡

GPS gives a general area estimate

GPS says you're in this general vicinity:

50%
🚪
Location A
30%
🔵
Location B
20%
🟡
Location C

GPS Insight: Satellite positioning gives rough area estimates but can be inaccurate in enclosed spaces like parking garages.

Camera Sensor: Detailed Visual Evidence
📷

Camera sees specific wall colors

Camera detects: "Blue wall visible"

How likely is this at each location?

10%
🚪
Location A
(Gray walls)
80%
🔵
Location B
(Blue walls)
40%
🟡
Location C
(Yellow walls)

Camera Insight: Visual sensors provide detailed local information but can be affected by lighting and obstructions.

โš–๏ธ 3. Bayesian Sensor Fusion

๐Ÿ”„ Combining GPS + Camera: The Power of Multiple Sensors
The Fusion Question

GPS says you're probably at the entrance, but camera sees blue walls. Bayes' theorem combines these conflicting signals to give the most accurate location estimate!

Step 1: Calculate Raw Scores

Likelihood × Prior for each location:

Location A (Entrance):

P(Blue|A) × P(A) = 0.10 × 0.50
= 0.050

GPS said 50% likely, camera says 10% chance of blue wall

Location B (Back Section):

P(Blue|B) × P(B) = 0.80 × 0.30
= 0.240

GPS said 30% likely, camera says 80% chance of blue wall

Location C (Middle Area):

P(Blue|C) × P(C) = 0.40 × 0.20
= 0.080

GPS said 20% likely, camera says 40% chance of blue wall

Step 2: Normalize to Get Final Probabilities

Total evidence strength:

P(Blue Wall) = 0.050 + 0.240 + 0.080
= 0.370

Sum of all raw scores

Final Location Probabilities:

A: 0.050 ÷ 0.370 = 13.5%
B: 0.240 ÷ 0.370 = 64.9%
C: 0.080 ÷ 0.370 = 21.6%

Normalization: Divide each raw score by total to get valid probabilities that sum to 100%.

Before Fusion: GPS Alone
50%
🚪
Location A
30%
🔵
Location B
20%
🟡
Location C

GPS thought Location A was most likely

After Fusion: GPS + Camera
13.5%
🚪
Location A
64.9%
🔵
Location B - MOST LIKELY! 🎯
21.6%
🟡
Location C

Camera evidence completely changed our belief!

🎯 Sensor Fusion Success!

The camera's visual evidence overrode GPS! Even though GPS was more confident about Location A (50%), the camera's strong signal for blue walls (80% at Location B) completely shifted our belief to Location B (64.9%).

Key Insight: Multiple sensors working together can be more accurate than any single sensor alone. This is why self-driving cars use many sensors!
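The full fusion computation fits in a few lines. A minimal Python sketch with the GPS prior and the camera's blue-wall likelihoods hard-coded from the tables above (dictionary names are our own):

```python
# GPS prior and camera likelihood of seeing a blue wall at each location.
prior     = {"A": 0.50, "B": 0.30, "C": 0.20}
p_blue_at = {"A": 0.10, "B": 0.80, "C": 0.40}

# Unnormalized score = likelihood x prior for each location
scores = {loc: p_blue_at[loc] * prior[loc] for loc in prior}
total = sum(scores.values())  # P(Blue Wall) = 0.37

# Normalize so the posteriors sum to 1
posterior = {loc: s / total for loc, s in scores.items()}
for loc, p in posterior.items():
    print(f"Location {loc}: {p:.1%}")
# Location A: 13.5%
# Location B: 64.9%
# Location C: 21.6%
```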

📊 4. Interactive Visualization & Experimentation

Belief Update Visualization

Notice how the posterior (green) differs from the prior (blue) after incorporating camera evidence

🎮 Interactive Sensor Fusion Lab

Experiment with different sensor characteristics and see how they affect localization accuracy!

P(Blue | Location A) 10%
P(Blue | Location B) 80%
P(Blue | Location C) 40%
Location A
13.5%

Posterior Probability

Location B
64.9%

Posterior Probability

Location C
21.6%

Posterior Probability

🚀 5. Real-World Impact & Applications

Where Bayesian Sensor Fusion Makes a Difference
Autonomous Vehicles
  • Camera + Radar + LiDAR + GPS for precise positioning
  • Handles GPS outages in tunnels/cities
  • Combines visual lane detection with radar distance measurement
  • Essential for safe self-driving
Smartphones & GPS
  • GPS + WiFi + Cell towers for accurate location
  • Works indoors where GPS alone fails
  • Powers navigation apps and location services
  • Enables location-based advertising
Drones & Robotics
  • IMU + GPS + Visual odometry for stable flight
  • Maintains position during GPS loss
  • Combines accelerometer data with camera input
  • Critical for autonomous navigation
Industrial Applications
  • Multiple sensors for quality control and inspection
  • Robotic arms combining vision and touch sensors
  • Medical imaging combining different modalities
  • Weather forecasting with multiple data sources

The Power of Sensor Fusion

Individual sensors are imperfect, but when combined intelligently using Bayesian methods, they create systems that are more reliable than any single sensor alone. This is why modern autonomous systems use 10+ different sensors working together!

Key Takeaways

What We Learned
  1. Bayesian inference updates beliefs with new evidence
  2. Prior beliefs matter but can be overcome by strong evidence
  3. Rare events (disease, spam) require careful interpretation
  4. Multiple sources of evidence can be combined optimally
  5. Normalization ensures probabilities sum to 1
Practical Tips
  • Always identify prior, likelihood, and evidence
  • Use law of total probability for P(E)
  • Normalize to ensure valid probability distribution
  • Consider all hypotheses that could explain evidence
  • Update iteratively as new evidence arrives
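The "update iteratively" tip deserves a concrete example: each posterior becomes the prior for the next piece of evidence. A sketch reusing the COVID test numbers, applying two positive tests in sequence (this assumes the test results are conditionally independent given disease status, which real repeated tests may not satisfy):

```python
def update(prior, p_e_h, p_e_not_h):
    """One Bayesian update: yesterday's posterior is today's prior."""
    return p_e_h * prior / (p_e_h * prior + p_e_not_h * (1 - prior))

belief = 0.01  # start from the 1% prevalence prior
for test in (1, 2):
    belief = update(belief, 0.95, 0.05)  # each test comes back positive
    print(f"after positive test {test}: {belief:.1%}")
# after positive test 1: 16.1%
# after positive test 2: 78.5%
```

One positive test is weak evidence for a rare disease, but a second independent positive pushes the belief from 16.1% to 78.5%.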
The Power of Bayesian Thinking

Bayesian inference is the mathematically optimal way to learn from data.
It's the foundation of modern AI, from medical diagnosis to spam filters to self-driving cars.
Every time you see new evidence, ask: "How should this update my beliefs?"

Next Steps: Now that you understand Bayesian inference, you're ready to explore more advanced topics like Bayesian networks, machine learning, and probabilistic programming!