Bayes’ Formula

Mathematical Statistics

Samir Orujov, PhD

ADA University, School of Business

Information Communication Technologies Agency, Statistics Unit

2025-10-18

Overview

Today’s Journey

  • 🎯 Bayes’ Formula Foundation
  • 📊 Total Probability Law
  • 🏥 Real-world Applications
  • 🔄 Updating Beliefs with Evidence

Learning Objectives

  • ✅ Master Bayes’ Theorem applications
  • 🧮 Solve complex probability problems
  • 🔄 Update probabilities with new evidence
  • ⚖️ Apply odds and likelihood ratios

Think-Pair-Share: Medical Test Intuition

🏥 Scenario: A disease affects 0.5% of the population…

Think (1 minute): If a test is 95% accurate and you test positive, what’s the probability you have the disease?

👥 Pair (2 minutes): Discuss your reasoning

🗣️ Share: Let’s hear some estimates before we solve it!

Bayes’ Formula Foundation

The Foundation

Basic Setup: Let \(E\) and \(F\) be events. Then \(E = EF \cup EF^c\)

Total Probability Law: \[P(E) = P(EF) + P(EF^c) = P(E \mid F)P(F) + P(E \mid F^c)P(F^c)\]

Expanding: \[P(E) = P(E \mid F)P(F) + P(E \mid F^c)(1 - P(F))\]
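As a quick numeric sanity check, the two-event law translates directly to code (a minimal sketch; the function name and sample values are illustrative, not from the text):

```python
def total_probability(p_f, p_e_given_f, p_e_given_fc):
    """P(E) = P(E|F)P(F) + P(E|F^c)(1 - P(F))."""
    return p_e_given_f * p_f + p_e_given_fc * (1 - p_f)

# Illustrative values: P(F) = 0.3, P(E|F) = 0.4, P(E|F^c) = 0.2
print(round(total_probability(0.3, 0.4, 0.2), 2))  # 0.26
```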

Venn Diagram Visualization

[Venn diagram: sample space \(S\) with overlapping events \(E\) and \(F\), showing the regions \(EF\) and \(EF^c\)]

Key Insight: Event \(E\) can happen in two mutually exclusive ways:

  • With \(F\) occurring: \(EF\)
  • With \(F\) not occurring: \(EF^c\)

Bayes’ Formula Definition

Definition: Bayes’ Formula

Given events \(E\) and \(F\) with \(P(E) > 0\):

\[\boxed{P(F \mid E) = \frac{P(E \mid F)P(F)}{P(E)}}\]

Or equivalently:

\[\boxed{P(F \mid E) = \frac{P(E \mid F)P(F)}{P(E \mid F)P(F) + P(E \mid F^c)P(F^c)}}\]

💡 Key Insight: Bayes’ formula allows us to “reverse” conditional probabilities!
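The second boxed form maps directly onto a short function (a sketch; the names and sample numbers are mine, not from the text):

```python
def bayes(prior, like_if_h, like_if_not_h):
    """P(H|E) with the total-probability law in the denominator."""
    evidence = like_if_h * prior + like_if_not_h * (1 - prior)  # P(E)
    return like_if_h * prior / evidence

# "Reversing" a conditional: from P(E|H) = 0.9, P(E|H^c) = 0.1, P(H) = 0.5
print(round(bayes(0.5, 0.9, 0.1), 3))  # 0.9
```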

Example 1: Insurance Claims Analysis

Problem: An insurance company classifies people as either accident-prone or not accident-prone.

Given Data:

  • Accident-prone: \(P(\text{accident in 1 year}) = 0.4\)
  • Not accident-prone: \(P(\text{accident in 1 year}) = 0.2\)
  • 30% of the population is accident-prone

Question: What’s the probability a new policyholder has an accident within a year?

Example 1: Solution Process

Define Events:

  • \(A\) = accident occurs
  • \(P\) = person is accident-prone (here \(P\) names an event, not the probability function)

Given Information:

  • \(P(P) = 0.3\), so \(P(P^c) = 0.7\)
  • \(P(A \mid P) = 0.4\)
  • \(P(A \mid P^c) = 0.2\)

Solution using Total Probability: \[P(A) = P(A \mid P)P(P) + P(A \mid P^c)P(P^c)\] \[P(A) = 0.4 \times 0.3 + 0.2 \times 0.7 = 0.12 + 0.14 = \boxed{0.26}\]

Think-Pair-Share: Reverse Probability

🤔 New Question: If a policyholder has an accident, what’s the probability they’re accident-prone?

Think (2 minutes): How does this differ from the previous problem?

👥 Pair: Compare your approaches with a neighbor

🔑 Key Difference: We know \(P(A \mid P)\) but need \(P(P \mid A)\) - this is where Bayes shines!

Example 2: Applying Bayes’ Formula

Question: \(P(\text{accident-prone} \mid \text{accident occurred})\) = ?

Using Bayes’ Formula: \[P(P \mid A) = \frac{P(A \mid P)P(P)}{P(A)}\]

Solution: \[P(P \mid A) = \frac{0.4 \times 0.3}{0.26} = \frac{0.12}{0.26} = \frac{6}{13} \approx \boxed{0.462}\]

Interpretation

Before accident: 30% chance of being accident-prone
After accident: 46.2% chance - evidence updated our belief!

Example 3: Multiple Choice Test Analysis

Scenario: A student either knows the answer (probability \(p\)) or guesses (probability \(1-p\)).

Given:

  • If knowing: \(P(\text{correct}) = 1\)
  • If guessing: \(P(\text{correct}) = \frac{1}{m}\) (where \(m\) = number of choices)
  • Student answered correctly

Question: \(P(\text{student knew answer} \mid \text{answered correctly})\) = ?

⏱️ Group Activity: Work in pairs for 3 minutes

Example 3: Solution Steps

Define Events:

  • \(K\) = student knows answer
  • \(C\) = student answers correctly

Given:

  • \(P(K) = p\), \(P(K^c) = 1-p\)
  • \(P(C \mid K) = 1\), \(P(C \mid K^c) = \frac{1}{m}\)

Step 1 - Total Probability: \[P(C) = P(C \mid K)P(K) + P(C \mid K^c)P(K^c) = 1 \cdot p + \frac{1}{m}(1-p)\]

Example 3: Final Solution

Step 2 - Apply Bayes’ Formula: \[P(K \mid C) = \frac{P(C \mid K)P(K)}{P(C)} = \frac{1 \cdot p}{p + \frac{1-p}{m}}\]

Simplifying: \[\boxed{P(K \mid C) = \frac{mp}{mp + 1 - p} = \frac{mp}{(m-1)p + 1}}\]

Special Cases

  • If \(m = 2\) (True/False): \(P(K \mid C) = \frac{2p}{p+1}\)
  • If \(p = 0.5\) and \(m = 4\): \(P(K \mid C) = \frac{2}{2.5} = 0.8\)
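The closed form makes the special cases a one-line check (illustrative sketch; the function name is mine):

```python
def p_knew(p, m):
    """P(K | C) = mp / ((m - 1)p + 1) for an m-choice question."""
    return m * p / ((m - 1) * p + 1)

print(p_knew(0.5, 4))            # 0.8  (the p = 0.5, m = 4 case above)
print(round(p_knew(0.5, 2), 3))  # true/false case: 2p/(p + 1) = 2/3 ≈ 0.667
```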

Student Activity: Disease Testing Problem

The Classic Medical Test Problem:

Given Data:

  • Disease prevalence: 0.5% of population
  • Test sensitivity: 95% (detects disease when present)
  • False positive rate: 1% (positive result for healthy person)

🎯 Your Challenge: Person tests positive. What’s \(P(\text{has disease})\)?

⏱️ Time: 5 minutes in pairs

📝 Show: Complete calculation steps

Example 4: Medical Test Solution

Define Events:

  • \(D\) = has disease, \(T\) = positive test result

Given:

  • \(P(D) = 0.005\), \(P(D^c) = 0.995\)
  • \(P(T \mid D) = 0.95\), \(P(T \mid D^c) = 0.01\)

Step 1 - Total Probability: \[P(T) = 0.95 \times 0.005 + 0.01 \times 0.995 = 0.00475 + 0.00995 = 0.0147\]

Example 4: Surprising Result

Step 2 - Bayes’ Formula: \[P(D \mid T) = \frac{P(T \mid D)P(D)}{P(T)} = \frac{0.95 \times 0.005}{0.0147} = \frac{0.00475}{0.0147} \approx \boxed{0.323}\]

🚨 Surprising Result

Only 32.3% chance of having the disease despite positive test!

Why? The disease is rare, so false positives outnumber true positives.

🏥 Clinical Implication: This is why retesting or additional tests are often needed.
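A short sketch (illustrative names) reproduces the result and shows how strongly the base rate drives it:

```python
def posterior(prevalence, sensitivity, false_pos):
    """P(disease | positive) via Bayes' formula."""
    p_pos = sensitivity * prevalence + false_pos * (1 - prevalence)  # total probability
    return sensitivity * prevalence / p_pos

# The lecture's numbers: 0.5% prevalence, 95% sensitivity, 1% false positives
print(round(posterior(0.005, 0.95, 0.01), 3))  # 0.323
# Same test at higher prevalences: the base rate, not the test, drives the answer
for prev in (0.05, 0.5):
    print(prev, round(posterior(prev, 0.95, 0.01), 3))
```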

Medical Dilemma: Complex Scenario

Real-World Application: Medical practitioner’s decision-making process

Initial Setup:

  • Doctor is initially 60% certain the patient has the disease
  • Surgery recommended if ≥80% certain
  • Test A: Always positive if disease present, rarely false positive

Plot Twist: After the positive test, the patient reveals they are diabetic!

  • For diabetic patients without the disease, Test A has a 30% false-positive rate

🏥 Clinical Decision: Surgery or more tests?

Medical Dilemma: Structured Analysis

Problem Components:

  1. Prior belief: \(P(\text{disease}) = 0.6\)
  2. Test characteristics: \(P(\text{positive}|\text{disease}) = 1.0\)
  3. Complication: \(P(\text{positive}|\text{no disease, diabetic}) = 0.3\)
  4. Decision threshold: Surgery if \(P(\text{disease}|\text{evidence}) \geq 0.8\)

👥 Think-Pair-Share: How does the diabetes information change our analysis?

Medical Dilemma: Complete Solution

Define Events:

  • \(D\) = patient has disease
  • \(T^+\) = positive test result
  • Given: patient is diabetic

Updated Calculation: \[P(D \mid T^+) = \frac{P(T^+ \mid D) \cdot P(D)}{P(T^+ \mid D) \cdot P(D) + P(T^+ \mid D^c, \text{diabetic}) \cdot P(D^c)}\]

Medical Dilemma: Final Answer

Solution: \[P(D \mid T^+) = \frac{1.0 \times 0.6}{1.0 \times 0.6 + 0.3 \times 0.4} = \frac{0.6}{0.6 + 0.12} = \frac{0.6}{0.72} \approx \boxed{0.833}\]

✅ Decision

Since 83.3% > 80%, recommend surgery!

Note: Without the diabetes information, probability would be higher. The complication reduces certainty but still exceeds threshold.

Example 5: Criminal Investigation

Detective’s Dilemma: An inspector is 60% convinced of a suspect’s guilt.

New Evidence: The criminal is known to have a certain characteristic (such as left-handedness, baldness, or brown hair)

  • 20% of population has this characteristic
  • Suspect has the characteristic

Question: How certain should the inspector now be?

🕵️ Think-Write-Pair: Work through this step by step (3 minutes)

Example 5: Investigation Solution

Define Events:

  • \(G\) = suspect is guilty
  • \(C\) = suspect has the characteristic

Assumptions:

  • \(P(G) = 0.6\) (prior belief)
  • \(P(C \mid G) = 1.0\) (guilty person definitely has it)
  • \(P(C \mid G^c) = 0.2\) (20% of innocent people have it)

Example 5: Calculation

Solution: \[P(G \mid C) = \frac{P(C \mid G)P(G)}{P(C \mid G)P(G) + P(C \mid G^c)P(G^c)}\] \[= \frac{1.0 \times 0.6}{1.0 \times 0.6 + 0.2 \times 0.4} = \frac{0.6}{0.6 + 0.08} = \frac{0.6}{0.68} \approx \boxed{0.882}\]

Interpretation

New certainty: 88.2% - much stronger case!

Evidence increased probability from 60% to 88.2%.

Bridge Championship Scandal

Historical Case: 1965 Buenos Aires World Bridge Championships

Accusation: British pair Reese and Schapiro accused of cheating using finger signals

Legal Proceedings:

  • Prosecution: “Their play was consistent with having illicit knowledge”
  • Defense: “Their play was also consistent with standard strategy”
  • Prosecution Counter: “Consistency with guilt counts as evidence”

🤔 Critical Thinking: What’s wrong with the prosecution’s reasoning?

Bridge Case: Bayesian Analysis

The Fallacy:

Prosecution only considered \(P(\text{play pattern} \mid \text{guilty})\)

But Bayes requires comparing: \[\frac{P(\text{pattern} \mid \text{guilty})}{P(\text{pattern} \mid \text{innocent})}\]

💡 Bayesian Insight

Evidence only favors guilt if the play pattern is more likely under guilt than innocence.

If \(P(\text{pattern} \mid \text{guilty}) \approx P(\text{pattern} \mid \text{innocent})\), the evidence is neutral!

Understanding Odds

Definition: Odds

The odds of event \(A\) are: \[\boxed{\text{Odds}(A) = \frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)}}\]

Interpretation: How much more likely \(A\) is than not-\(A\)

Example: If \(P(A) = 0.75\), then \(\text{Odds}(A) = \frac{0.75}{0.25} = 3\) (written as “3 to 1”)

Bayes’ Formula for Odds

Odds Form of Bayes’ Theorem

\[\boxed{\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)}{P(H^c)} \times \frac{P(E \mid H)}{P(E \mid H^c)}}\]

Or in words:

\[\boxed{\text{Posterior Odds} = \text{Prior Odds} \times \text{Likelihood Ratio}}\]

💡 Key Insight: Evidence multiplies prior odds by the likelihood ratio!
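A minimal sketch of the odds-form update (function names are mine). With Example 5’s numbers, prior odds \(0.6/0.4 = 1.5\) times likelihood ratio \(1.0/0.2 = 5\) gives posterior odds 7.5:

```python
def posterior_odds(prior_prob, like_h, like_not_h):
    """Posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    return prior_odds * (like_h / like_not_h)

def odds_to_prob(odds):
    return odds / (1 + odds)

odds = posterior_odds(0.6, 1.0, 0.2)   # 1.5 * 5 = 7.5
print(round(odds_to_prob(odds), 3))    # 0.882, matching Example 5
```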

Example 6: Coin Type Identification

Setup: Urn contains 2 Type A coins and 1 Type B coin

Coin Properties:

  • Type A: \(P(\text{heads}) = \frac{1}{4}\)
  • Type B: \(P(\text{heads}) = \frac{3}{4}\)

Experiment: Random coin selected, flipped, shows heads

Question: \(P(\text{Type A coin} \mid \text{heads})\) = ?

Example 6: Solution Process

Define Events:

  • \(A\) = Type A coin selected; \(B = A^c\) = Type B coin selected
  • \(H\) = heads result

Prior Probabilities:

  • \(P(A) = \frac{2}{3}\), \(P(B) = \frac{1}{3}\)

Likelihoods:

  • \(P(H|A) = \frac{1}{4}\), \(P(H|B) = \frac{3}{4}\)

Example 6: Calculation

Solution using Bayes: \[P(A \mid H) = \frac{P(H \mid A)P(A)}{P(H \mid A)P(A) + P(H \mid B)P(B)}\] \[= \frac{\frac{1}{4} \times \frac{2}{3}}{\frac{1}{4} \times \frac{2}{3} + \frac{3}{4} \times \frac{1}{3}} = \frac{\frac{1}{6}}{\frac{1}{6} + \frac{1}{4}} = \frac{\frac{1}{6}}{\frac{5}{12}} = \frac{2}{5} = \boxed{0.4}\]

Interpretation

Before flip: \(P(A) = 66.7\%\)
After heads: \(P(A|H) = 40\%\)
Observing heads makes Type B more likely!
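The 40% answer can also be checked by simulation (a sketch; the seed and trial count are arbitrary):

```python
import random

random.seed(0)
heads_total = type_a_given_heads = 0
for _ in range(200_000):
    coin_is_a = random.random() < 2/3        # 2 of the 3 coins are Type A
    p_heads = 0.25 if coin_is_a else 0.75
    if random.random() < p_heads:            # the flip shows heads
        heads_total += 1
        type_a_given_heads += coin_is_a
print(round(type_a_given_heads / heads_total, 2))  # ≈ 0.4
```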

Total Probability Law (General)

General Formula

If \(F_1, F_2, \ldots, F_n\) are mutually exclusive and exhaustive: \[\boxed{P(E) = \sum_{i=1}^{n} P(E \mid F_i)P(F_i)}\]

Bayes’ Formula (General): \[\boxed{P(F_j \mid E) = \frac{P(E \mid F_j)P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i)P(F_i)}}\]
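The general formula maps onto a short function over lists of priors and likelihoods (a sketch; the names and sample numbers are illustrative):

```python
def bayes_general(priors, likelihoods):
    """P(F_j | E) for each j, from priors P(F_i) and likelihoods P(E | F_i)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)                # total probability P(E)
    return [j / evidence for j in joint]

# Illustrative three-hypothesis example
post = bayes_general([0.5, 0.3, 0.2], [0.1, 0.5, 0.9])
print([round(x, 3) for x in post])       # [0.132, 0.395, 0.474]
```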

Example 7: Searching for a Missing Plane

Setup: A missing plane is equally likely to have gone down in any of three regions. Region 1 is searched without success; \(\beta_1\) is the probability the search overlooks the plane when it is actually in region 1. Given the failed search, what are the updated probabilities for each region?

Define Events:

  • \(R_i\) = plane in region \(i\)
  • \(U_1\) = unsuccessful search of region 1

Given: \(P(R_i) = \frac{1}{3}\) for all \(i\)

Likelihoods:

  • \(P(U_1 \mid R_1) = \beta_1\) (plane there but not found)
  • \(P(U_1 \mid R_2) = P(U_1 \mid R_3) = 1\) (plane not there)

Example 7: Final Answers

Total Probability: \[P(U_1) = \beta_1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} = \frac{\beta_1 + 2}{3}\]

Solutions: \[\boxed{P(R_1 \mid U_1) = \frac{\beta_1}{2 + \beta_1}}\]

\[\boxed{P(R_2 \mid U_1) = P(R_3 \mid U_1) = \frac{1}{2 + \beta_1}}\]

🔍 Insight: If \(\beta_1\) is small (good search), plane is unlikely in region 1.
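A sketch with an assumed value \(\beta_1 = 0.1\) (illustrative, not from the text) makes the insight concrete:

```python
def search_posteriors(beta1):
    """P(R_i | U_1): region probabilities after a failed search of region 1."""
    priors = [1/3, 1/3, 1/3]
    likes = [beta1, 1.0, 1.0]            # P(U_1 | R_i)
    joint = [l * p for l, p in zip(likes, priors)]
    total = sum(joint)                   # equals (beta1 + 2) / 3
    return [j / total for j in joint]

# A fairly thorough search: beta1 = 0.1
print([round(x, 3) for x in search_posteriors(0.1)])  # [0.048, 0.476, 0.476]
```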

Example 8: Three-Card Problem

Classic Puzzle: 3 identical cards in a hat

Card Types:

  • Card 1: Red-Red
  • Card 2: Black-Black
  • Card 3: Red-Black

Experiment: Draw a card at random and place it down; the side facing up is red

Question: \(P(\text{other side is black})\) = ?

⚠️ Common Mistake: Thinking it’s \(\frac{1}{2}\)

Example 8: Careful Analysis

Sample Space: 6 equally likely sides could be showing

  • RR card: Red₁, Red₂
  • BB card: Black₁, Black₂
  • RB card: Red₃, Black₃

Given: Red side showing, so we have Red₁, Red₂, or Red₃

Analysis:

  • If Red₁ showing → other side is Red₂
  • If Red₂ showing → other side is Red₁
  • If Red₃ showing → other side is Black₃

Example 8: Solution

Solution: \(P(\text{other side black} \mid \text{red showing}) = \frac{1}{3}\)

💡 Key Insight

RR card has twice the chance to show red!

Of 3 red sides, only 1 has black on the other side.

Formal Solution: \[P(\text{Black on other} \mid \text{Red showing}) = \frac{P(\text{Red showing} \mid \text{RB card})P(\text{RB card})}{P(\text{Red showing})}\] \[= \frac{\frac{1}{2} \times \frac{1}{3}}{\frac{1}{2}} = \frac{1}{3}\]
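A quick simulation (a sketch; the seed and trial count are arbitrary) confirms the 1/3 answer:

```python
import random

random.seed(1)
cards = [("R", "R"), ("B", "B"), ("R", "B")]
red_showing = other_black = 0
for _ in range(200_000):
    card = list(random.choice(cards))
    random.shuffle(card)                     # a uniformly random side faces up
    if card[0] == "R":
        red_showing += 1
        other_black += card[1] == "B"
print(round(other_black / red_showing, 2))   # ≈ 0.33
```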

Think-Pair-Share: Family Composition

The Ambiguous Problem: A family has two children; the mother is seen walking with one of them, a girl.

Question: \(P(\text{both children are girls} \mid \text{this child is a girl})\) = ?

🚨 Warning: “This problem is incapable of solution without more information”

🤔 Discussion Points:

  1. What assumptions are we making?
  2. How does the selection process matter?
  3. What additional information do we need?

Example 9: Why It’s Unsolvable

Missing Information: How was this child selected?

Scenario A: Random selection from the two children

  • Given: the randomly chosen child is a girl
  • Answer: \(P(\text{both girls}) = \frac{1}{2}\)

Scenario B: Mother always walks with a girl if she has one

  • Given: Walking with a girl
  • Answer: \(P(\text{both girls}) = \frac{1}{3}\)

💡 Lesson

The probability depends crucially on the selection mechanism!

Advanced Application: Flashlight Quality

Quality Control Problem: Bin contains 3 types of flashlights

Performance Data:

  • Type 1: \(P(>100 \text{ hours}) = 0.7\), comprises 20% of bin
  • Type 2: \(P(>100 \text{ hours}) = 0.4\), comprises 30% of bin
  • Type 3: \(P(>100 \text{ hours}) = 0.3\), comprises 50% of bin

Questions:

  1. \(P(\text{random flashlight lasts} >100 \text{ hours})\) = ?
  2. Given flashlight lasted >100 hours, \(P(\text{it was Type } j)\) = ?

Flashlight Solution: Part A

Part A: Overall probability of lasting >100 hours

Using Total Probability: \[P(L) = P(L \mid T_1)P(T_1) + P(L \mid T_2)P(T_2) + P(L \mid T_3)P(T_3)\] \[= 0.7 \times 0.2 + 0.4 \times 0.3 + 0.3 \times 0.5\] \[= 0.14 + 0.12 + 0.15 = \boxed{0.41}\]

Flashlight Solution: Part B

Part B: Given flashlight lasted >100 hours, find type probabilities

Using Bayes’ Formula: \[P(T_1 \mid L) = \frac{0.7 \times 0.2}{0.41} = \frac{0.14}{0.41} \approx \boxed{0.341}\]

\[P(T_2 \mid L) = \frac{0.4 \times 0.3}{0.41} = \frac{0.12}{0.41} \approx \boxed{0.293}\]

\[P(T_3 \mid L) = \frac{0.3 \times 0.5}{0.41} = \frac{0.15}{0.41} \approx \boxed{0.366}\]
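Both parts can be verified in a few lines (illustrative sketch; variable names are mine):

```python
priors = [0.2, 0.3, 0.5]                 # bin composition: Type 1, 2, 3
p_last = [0.7, 0.4, 0.3]                 # P(lasts > 100 hours | type)
joint = [p * q for p, q in zip(p_last, priors)]
p_l = sum(joint)                         # Part A: total probability
posteriors = [j / p_l for j in joint]    # Part B: Bayes for each type
print(round(p_l, 2))                     # 0.41
print([round(x, 3) for x in posteriors]) # [0.341, 0.293, 0.366]
```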

Flashlight: Surprising Insight

Before Testing:

  • Type 1: 20%, Type 2: 30%, Type 3: 50%

After Lasting >100 hours:

  • Type 1: 34.1%, Type 2: 29.3%, Type 3: 36.6%

🤯 Insight

Type 3 becomes most likely, despite being worst initially!

Why? It’s so common that even with lower quality, it contributes most to successful flashlights.

Interactive Quiz: Test Your Understanding

Q1: What does Bayes’ formula allow us to do?

  • Add probabilities together
  • Calculate P(cause|effect) from P(effect|cause)
  • Find independence between events
  • Multiply probabilities

Q2: Medical Test Result

In the medical test with 0.5% prevalence and 95% sensitivity, why is P(disease|positive) only 32%?

  • The test is unreliable and poorly designed
  • Disease is rare, so false positives outnumber true positives
  • There was a calculation error in the example
  • The test sensitivity should be higher

Q3: Likelihood Ratio

What is the likelihood ratio in Bayes’ odds form?

  • P(H)/P(Hc)
  • P(E|H)/P(E|Hc)
  • P(H|E)/P(Hc|E)
  • P(E)/P(Ec)

Key Formulas Summary

Essential Formulas

Bayes’ Formula: \[\boxed{P(H \mid E) = \frac{P(E \mid H)P(H)}{P(E)}}\]

Total Probability: \[\boxed{P(E) = \sum_{i} P(E \mid H_i)P(H_i)}\]

Odds Form: \[\boxed{\text{Posterior Odds} = \text{Prior Odds} \times \text{Likelihood Ratio}}\]

Problem-Solving Strategy Guide

Step 1: Clearly define all events

Step 2: Identify given vs. asked

Step 3: Determine Bayes’ or Total Probability

Step 4: Set up formula with values

Step 5: Compute and interpret

⚠️ Common Pitfalls:

  • Confusing P(A|B) with P(B|A)
  • Forgetting Total Probability for denominator
  • Making unjustified independence assumptions
  • Ignoring base rates (prior probabilities)

Real-World Applications

Where Bayes’ Theorem matters:

  • 🏥 Medicine: Disease diagnosis, treatment decisions
  • ⚖️ Law: Evidence evaluation, jury reasoning
  • 📧 Technology: Spam filters, recommendation systems
  • 💰 Finance: Risk assessment, fraud detection
  • 🤖 AI/ML: Bayesian networks, probabilistic models
  • 🔬 Science: Hypothesis testing, experimental design

💡 Core Principle

Bayes’ Theorem is the mathematical foundation for rational updating of beliefs in light of new evidence.

Looking Ahead

Next Topics:

  • Random Variables and Distributions
  • Expected Value and Variance
  • Common Distributions: Binomial, Poisson, Normal
  • Continuous Probability

Connections:

  • Today’s conditional probability foundation
  • Independence from previous lecture
  • Building toward statistical inference

🌟 Remember: Bayes’ Theorem is the heart of rational reasoning under uncertainty - you’ll use it throughout statistics and beyond!

Final Reflection

🤔 Think About:

  1. How does Bayes’ Theorem change how you evaluate evidence?
  2. When have you encountered base rate neglect in real life?
  3. How can prior beliefs be both helpful and harmful?

💬 Discussion Question:

“In the medical test example, most people guess 95% instead of 32%. Why do humans struggle with Bayesian reasoning? What are the practical implications?”

Questions?

Thank you!

Office Hours: By appointment via email

Contact: sorujov@ada.edu.az