Bayes’ Formula

Mathematical Statistics

Samir Orujov, PhD

ADA University, School of Business

Information Communication Technologies Agency, Statistics Unit

2025-10-18

Overview

Today’s Journey

  • 🎯 Bayes’ Formula Foundation
  • 📊 Total Probability Law
  • 🏥 Real-world Applications
  • 🔄 Updating Beliefs with Evidence

Learning Objectives

  • ✅ Master Bayes’ Theorem applications
  • 🧮 Solve complex probability problems
  • 🔄 Update probabilities with new evidence
  • ⚖️ Apply odds and likelihood ratios

Think-Pair-Share: Medical Test Intuition

🏥 Scenario: A disease affects 0.5% of the population…

Think (1 minute): If a test is 95% accurate and you test positive, what’s the probability you have the disease?

👥 Pair (2 minutes): Discuss your reasoning

🗣️ Share: Let’s hear some estimates before we solve it!

Bayes’ Formula Foundation

The Foundation

Basic Setup: Let \(E\) and \(F\) be events. Then \(E = EF \cup EF^c\)

Total Probability Law: \[P(E) = P(EF) + P(EF^c) = P(E \mid F)P(F) + P(E \mid F^c)P(F^c)\]

Expanding: \[P(E) = P(E \mid F)P(F) + P(E \mid F^c)(1 - P(F))\]
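As a quick numeric sanity check, the two-event law translates directly to code (a minimal sketch; the function name and sample values are illustrative, not from the text):

```python
def total_probability(p_f, p_e_given_f, p_e_given_fc):
    """P(E) = P(E|F)P(F) + P(E|F^c)(1 - P(F))."""
    return p_e_given_f * p_f + p_e_given_fc * (1 - p_f)

# Illustrative values: P(F) = 0.3, P(E|F) = 0.4, P(E|F^c) = 0.2
print(round(total_probability(0.3, 0.4, 0.2), 2))  # 0.26
```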

Venn Diagram Visualization

[Venn diagram: sample space \(S\) with overlapping events \(E\) and \(F\), showing the regions \(EF\) and \(EF^c\)]

Key Insight: Event \(E\) can happen in two mutually exclusive ways:

  • With \(F\) occurring: \(EF\)
  • With \(F\) not occurring: \(EF^c\)

Bayes’ Formula Definition

Definition: Bayes’ Formula

Given events \(E\) and \(F\) with \(P(E) > 0\):

\[\boxed{P(F \mid E) = \frac{P(E \mid F)P(F)}{P(E)}}\]

Or equivalently:

\[\boxed{P(F \mid E) = \frac{P(E \mid F)P(F)}{P(E \mid F)P(F) + P(E \mid F^c)P(F^c)}}\]

💡 Key Insight: Bayes’ formula allows us to “reverse” conditional probabilities!
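The second boxed form maps directly onto a short function (a sketch; the names and sample numbers are mine, not from the text):

```python
def bayes(prior, like_if_h, like_if_not_h):
    """P(H|E) with the total-probability law in the denominator."""
    evidence = like_if_h * prior + like_if_not_h * (1 - prior)  # P(E)
    return like_if_h * prior / evidence

# "Reversing" a conditional: from P(E|H) = 0.9, P(E|H^c) = 0.1, P(H) = 0.5
print(round(bayes(0.5, 0.9, 0.1), 3))  # 0.9
```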

Example 1: Insurance Claims Analysis

Problem: An insurance company classifies people as either accident-prone or not accident-prone.

Given Data:

  • Accident-prone: \(P(\text{accident in 1 year}) = 0.4\)
  • Not accident-prone: \(P(\text{accident in 1 year}) = 0.2\)
  • 30% of the population is accident-prone

Question: What’s the probability a new policyholder has an accident within a year?

Example 1: Solution Process

Define Events:

  • \(A\) = accident occurs
  • \(P\) = person is accident-prone (here \(P\) names an event, not the probability function)

Given Information:

  • \(P(P) = 0.3\), so \(P(P^c) = 0.7\)
  • \(P(A \mid P) = 0.4\)
  • \(P(A \mid P^c) = 0.2\)

Solution using Total Probability: \[P(A) = P(A \mid P)P(P) + P(A \mid P^c)P(P^c)\] \[P(A) = 0.4 \times 0.3 + 0.2 \times 0.7 = 0.12 + 0.14 = \boxed{0.26}\]

Think-Pair-Share: Reverse Probability

🤔 New Question: If a policyholder has an accident, what’s the probability they’re accident-prone?

Think (2 minutes): How does this differ from the previous problem?

👥 Pair: Compare your approaches with a neighbor

🔑 Key Difference: We know \(P(A \mid P)\) but need \(P(P \mid A)\) - this is where Bayes shines!

Example 2: Applying Bayes’ Formula

Question: \(P(\text{accident-prone} \mid \text{accident occurred})\) = ?

Using Bayes’ Formula: \[P(P \mid A) = \frac{P(A \mid P)P(P)}{P(A)}\]

Solution: \[P(P \mid A) = \frac{0.4 \times 0.3}{0.26} = \frac{0.12}{0.26} = \frac{6}{13} \approx \boxed{0.462}\]

Interpretation

Before accident: 30% chance of being accident-prone
After accident: 46.2% chance - evidence updated our belief!

Example 3: Multiple Choice Test Analysis

Scenario: A student either knows the answer (probability \(p\)) or guesses (probability \(1-p\)).

Given:

  • If knowing: \(P(\text{correct}) = 1\)
  • If guessing: \(P(\text{correct}) = \frac{1}{m}\) (where \(m\) = number of choices)
  • Student answered correctly

Question: \(P(\text{student knew answer} \mid \text{answered correctly})\) = ?

⏱️ Group Activity: Work in pairs for 3 minutes

Example 3: Solution Steps

Define Events:

  • \(K\) = student knows answer
  • \(C\) = student answers correctly

Given:

  • \(P(K) = p\), \(P(K^c) = 1-p\)
  • \(P(C \mid K) = 1\), \(P(C \mid K^c) = \frac{1}{m}\)

Step 1 - Total Probability: \[P(C) = P(C \mid K)P(K) + P(C \mid K^c)P(K^c) = 1 \cdot p + \frac{1}{m}(1-p)\]

Example 3: Final Solution

Step 2 - Apply Bayes’ Formula: \[P(K \mid C) = \frac{P(C \mid K)P(K)}{P(C)} = \frac{1 \cdot p}{p + \frac{1-p}{m}}\]

Simplifying: \[\boxed{P(K \mid C) = \frac{mp}{mp + 1 - p} = \frac{mp}{(m-1)p + 1}}\]

Special Cases

  • If \(m = 2\) (True/False): \(P(K \mid C) = \frac{2p}{p+1}\)
  • If \(p = 0.5\) and \(m = 4\): \(P(K \mid C) = \frac{2}{2.5} = 0.8\)
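The closed form makes the special cases a one-line check (illustrative sketch; the function name is mine):

```python
def p_knew(p, m):
    """P(K | C) = mp / ((m - 1)p + 1) for an m-choice question."""
    return m * p / ((m - 1) * p + 1)

print(p_knew(0.5, 4))            # 0.8  (the p = 0.5, m = 4 case above)
print(round(p_knew(0.5, 2), 3))  # true/false case: 2p/(p + 1) = 2/3 ≈ 0.667
```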

Student Activity: Disease Testing Problem

The Classic Medical Test Problem:

Given Data:

  • Disease prevalence: 0.5% of population
  • Test sensitivity: 95% (detects disease when present)
  • False positive rate: 1% (positive result for healthy person)

🎯 Your Challenge: Person tests positive. What’s \(P(\text{has disease})\)?

⏱️ Time: 5 minutes in pairs

📝 Show: Complete calculation steps

Example 4: Medical Test Solution

Define Events:

  • \(D\) = has disease, \(T\) = positive test result

Given:

  • \(P(D) = 0.005\), \(P(D^c) = 0.995\)
  • \(P(T \mid D) = 0.95\), \(P(T \mid D^c) = 0.01\)

Step 1 - Total Probability: \[P(T) = 0.95 \times 0.005 + 0.01 \times 0.995 = 0.00475 + 0.00995 = 0.0147\]

Example 4: Surprising Result

Step 2 - Bayes’ Formula: \[P(D \mid T) = \frac{P(T \mid D)P(D)}{P(T)} = \frac{0.95 \times 0.005}{0.0147} = \frac{0.00475}{0.0147} \approx \boxed{0.323}\]

🚨 Surprising Result

Only 32.3% chance of having the disease despite positive test!

Why? The disease is rare, so false positives outnumber true positives.

🏥 Clinical Implication: This is why retesting or additional tests are often needed.
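A short sketch (illustrative names) reproduces the result and shows how strongly the base rate drives it:

```python
def posterior(prevalence, sensitivity, false_pos):
    """P(disease | positive) via Bayes' formula."""
    p_pos = sensitivity * prevalence + false_pos * (1 - prevalence)  # total probability
    return sensitivity * prevalence / p_pos

# The lecture's numbers: 0.5% prevalence, 95% sensitivity, 1% false positives
print(round(posterior(0.005, 0.95, 0.01), 3))  # 0.323
# Same test at higher prevalences: the base rate, not the test, drives the answer
for prev in (0.05, 0.5):
    print(prev, round(posterior(prev, 0.95, 0.01), 3))
```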

Medical Dilemma: Complex Scenario

Real-World Application: Medical practitioner’s decision-making process

Initial Setup:

  • Doctor is initially 60% certain the patient has the disease
  • Surgery recommended if ≥80% certain
  • Test A: Always positive if disease present, rarely false positive

Plot Twist: After the positive test, the patient reveals they are diabetic!

  • For diabetic patients without the disease, Test A has a 30% false-positive rate

🏥 Clinical Decision: Surgery or more tests?

Medical Dilemma: Structured Analysis

Problem Components:

  1. Prior belief: \(P(\text{disease}) = 0.6\)
  2. Test characteristics: \(P(\text{positive}|\text{disease}) = 1.0\)
  3. Complication: \(P(\text{positive}|\text{no disease, diabetic}) = 0.3\)
  4. Decision threshold: Surgery if \(P(\text{disease}|\text{evidence}) \geq 0.8\)

👥 Think-Pair-Share: How does the diabetes information change our analysis?

Medical Dilemma: Complete Solution

Define Events:

  • \(D\) = patient has disease
  • \(T^+\) = positive test result
  • Given: patient is diabetic

Updated Calculation: \[P(D \mid T^+) = \frac{P(T^+ \mid D) \cdot P(D)}{P(T^+ \mid D) \cdot P(D) + P(T^+ \mid D^c, \text{diabetic}) \cdot P(D^c)}\]

Medical Dilemma: Final Answer

Solution: \[P(D \mid T^+) = \frac{1.0 \times 0.6}{1.0 \times 0.6 + 0.3 \times 0.4} = \frac{0.6}{0.6 + 0.12} = \frac{0.6}{0.72} \approx \boxed{0.833}\]

✅ Decision

Since 83.3% > 80%, recommend surgery!

Note: Without the diabetes information, probability would be higher. The complication reduces certainty but still exceeds threshold.

Example 5: Criminal Investigation

Detective’s Dilemma: An inspector is 60% convinced of a suspect’s guilt.

New Evidence: The criminal is known to have a certain characteristic (such as left-handedness, baldness, or brown hair)

  • 20% of population has this characteristic
  • Suspect has the characteristic

Question: How certain should the inspector now be?

🕵️ Think-Write-Pair: Work through this step by step (3 minutes)

Example 5: Investigation Solution

Define Events:

  • \(G\) = suspect is guilty
  • \(C\) = suspect has the characteristic

Assumptions:

  • \(P(G) = 0.6\) (prior belief)
  • \(P(C \mid G) = 1.0\) (guilty person definitely has it)
  • \(P(C \mid G^c) = 0.2\) (20% of innocent people have it)

Example 5: Calculation

Solution: \[P(G \mid C) = \frac{P(C \mid G)P(G)}{P(C \mid G)P(G) + P(C \mid G^c)P(G^c)}\] \[= \frac{1.0 \times 0.6}{1.0 \times 0.6 + 0.2 \times 0.4} = \frac{0.6}{0.6 + 0.08} = \frac{0.6}{0.68} \approx \boxed{0.882}\]

Interpretation

New certainty: 88.2% - much stronger case!

Evidence increased probability from 60% to 88.2%.

Bridge Championship Scandal

Historical Case: 1965 Buenos Aires World Bridge Championships

Accusation: British pair Reese and Schapiro accused of cheating using finger signals

Legal Proceedings:

  • Prosecution: “Their play was consistent with having illicit knowledge”
  • Defense: “Their play was also consistent with standard strategy”
  • Prosecution Counter: “Consistency with guilt counts as evidence”

🤔 Critical Thinking: What’s wrong with the prosecution’s reasoning?

Bridge Case: Bayesian Analysis

The Fallacy:

Prosecution only considered \(P(\text{play pattern} \mid \text{guilty})\)

But Bayes requires comparing: \[\frac{P(\text{pattern} \mid \text{guilty})}{P(\text{pattern} \mid \text{innocent})}\]

💡 Bayesian Insight

Evidence only favors guilt if the play pattern is more likely under guilt than innocence.

If \(P(\text{pattern} \mid \text{guilty}) \approx P(\text{pattern} \mid \text{innocent})\), the evidence is neutral!

Understanding Odds

Definition: Odds

The odds of event \(A\) are: \[\boxed{\text{Odds}(A) = \frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)}}\]

Interpretation: How much more likely \(A\) is than not-\(A\)

Example: If \(P(A) = 0.75\), then \(\text{Odds}(A) = \frac{0.75}{0.25} = 3\) (written as “3 to 1”)

Bayes’ Formula for Odds

Odds Form of Bayes’ Theorem

\[\boxed{\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)}{P(H^c)} \times \frac{P(E \mid H)}{P(E \mid H^c)}}\]

Or in words:

\[\boxed{\text{Posterior Odds} = \text{Prior Odds} \times \text{Likelihood Ratio}}\]

💡 Key Insight: Evidence multiplies prior odds by the likelihood ratio!
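A minimal sketch of the odds-form update (function names are mine). With Example 5’s numbers, prior odds \(0.6/0.4 = 1.5\) times likelihood ratio \(1.0/0.2 = 5\) gives posterior odds 7.5:

```python
def posterior_odds(prior_prob, like_h, like_not_h):
    """Posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    return prior_odds * (like_h / like_not_h)

def odds_to_prob(odds):
    return odds / (1 + odds)

odds = posterior_odds(0.6, 1.0, 0.2)   # 1.5 * 5 = 7.5
print(round(odds_to_prob(odds), 3))    # 0.882, matching Example 5
```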

Example 6: Coin Type Identification

Setup: Urn contains 2 Type A coins and 1 Type B coin

Coin Properties:

  • Type A: \(P(\text{heads}) = \frac{1}{4}\)
  • Type B: \(P(\text{heads}) = \frac{3}{4}\)

Experiment: Random coin selected, flipped, shows heads

Question: \(P(\text{Type A coin} \mid \text{heads})\) = ?

Example 6: Solution Process

Define Events:

  • \(A\) = Type A coin selected; \(B = A^c\) = Type B coin selected
  • \(H\) = heads result

Prior Probabilities:

  • \(P(A) = \frac{2}{3}\), \(P(B) = \frac{1}{3}\)

Likelihoods:

  • \(P(H|A) = \frac{1}{4}\), \(P(H|B) = \frac{3}{4}\)

Example 6: Calculation

Solution using Bayes: \[P(A \mid H) = \frac{P(H \mid A)P(A)}{P(H \mid A)P(A) + P(H \mid B)P(B)}\] \[= \frac{\frac{1}{4} \times \frac{2}{3}}{\frac{1}{4} \times \frac{2}{3} + \frac{3}{4} \times \frac{1}{3}} = \frac{\frac{1}{6}}{\frac{1}{6} + \frac{1}{4}} = \frac{\frac{1}{6}}{\frac{5}{12}} = \frac{2}{5} = \boxed{0.4}\]

Interpretation

Before flip: \(P(A) = 66.7\%\)
After heads: \(P(A|H) = 40\%\)
Observing heads makes Type B more likely!
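The 40% answer can also be checked by simulation (a sketch; the seed and trial count are arbitrary):

```python
import random

random.seed(0)
heads_total = type_a_given_heads = 0
for _ in range(200_000):
    coin_is_a = random.random() < 2/3        # 2 of the 3 coins are Type A
    p_heads = 0.25 if coin_is_a else 0.75
    if random.random() < p_heads:            # the flip shows heads
        heads_total += 1
        type_a_given_heads += coin_is_a
print(round(type_a_given_heads / heads_total, 2))  # ≈ 0.4
```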

Total Probability Law (General)

General Formula

If \(F_1, F_2, \ldots, F_n\) are mutually exclusive and exhaustive: \[\boxed{P(E) = \sum_{i=1}^{n} P(E \mid F_i)P(F_i)}\]

Bayes’ Formula (General): \[\boxed{P(F_j \mid E) = \frac{P(E \mid F_j)P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i)P(F_i)}}\]
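The general formula maps onto a short function over lists of priors and likelihoods (a sketch; the names and sample numbers are illustrative):

```python
def bayes_general(priors, likelihoods):
    """P(F_j | E) for each j, from priors P(F_i) and likelihoods P(E | F_i)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)                # total probability P(E)
    return [j / evidence for j in joint]

# Illustrative three-hypothesis example
post = bayes_general([0.5, 0.3, 0.2], [0.1, 0.5, 0.9])
print([round(x, 3) for x in post])       # [0.132, 0.395, 0.474]
```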

Example 7: Searching for a Missing Plane

Setup: A missing plane is equally likely to have gone down in any of three regions. Region 1 is searched without success; \(\beta_1\) is the probability the search overlooks the plane when it is actually in region 1. Given the failed search, what are the updated probabilities for each region?

Define Events:

  • \(R_i\) = plane in region \(i\)
  • \(U_1\) = unsuccessful search of region 1

Given: \(P(R_i) = \frac{1}{3}\) for all \(i\)

Likelihoods:

  • \(P(U_1 \mid R_1) = \beta_1\) (plane there but not found)
  • \(P(U_1 \mid R_2) = P(U_1 \mid R_3) = 1\) (plane not there)

Example 7: Final Answers

Total Probability: \[P(U_1) = \beta_1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} + 1 \cdot \frac{1}{3} = \frac{\beta_1 + 2}{3}\]

Solutions: \[\boxed{P(R_1 \mid U_1) = \frac{\beta_1}{2 + \beta_1}}\]

\[\boxed{P(R_2 \mid U_1) = P(R_3 \mid U_1) = \frac{1}{2 + \beta_1}}\]

🔍 Insight: If \(\beta_1\) is small (good search), plane is unlikely in region 1.
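A sketch with an assumed value \(\beta_1 = 0.1\) (illustrative, not from the text) makes the insight concrete:

```python
def search_posteriors(beta1):
    """P(R_i | U_1): region probabilities after a failed search of region 1."""
    priors = [1/3, 1/3, 1/3]
    likes = [beta1, 1.0, 1.0]            # P(U_1 | R_i)
    joint = [l * p for l, p in zip(likes, priors)]
    total = sum(joint)                   # equals (beta1 + 2) / 3
    return [j / total for j in joint]

# A fairly thorough search: beta1 = 0.1
print([round(x, 3) for x in search_posteriors(0.1)])  # [0.048, 0.476, 0.476]
```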

Example 8: Three-Card Problem

Classic Puzzle: 3 identical cards in a hat

Card Types:

  • Card 1: Red-Red
  • Card 2: Black-Black
  • Card 3: Red-Black

Experiment: Draw a card at random and place it down; the side facing up is red

Question: \(P(\text{other side is black})\) = ?

⚠️ Common Mistake: Thinking it’s \(\frac{1}{2}\)

Example 8: Careful Analysis

Sample Space: 6 equally likely sides could be showing

  • RR card: Red₁, Red₂
  • BB card: Black₁, Black₂
  • RB card: Red₃, Black₃

Given: Red side showing, so we have Red₁, Red₂, or Red₃

Analysis:

  • If Red₁ showing → other side is Red₂
  • If Red₂ showing → other side is Red₁
  • If Red₃ showing → other side is Black₃

Example 8: Solution

Solution: \(P(\text{other side black} \mid \text{red showing}) = \frac{1}{3}\)

💡 Key Insight

RR card has twice the chance to show red!

Of 3 red sides, only 1 has black on the other side.

Formal Solution: \[P(\text{Black on other} \mid \text{Red showing}) = \frac{P(\text{Red showing} \mid \text{RB card})P(\text{RB card})}{P(\text{Red showing})}\] \[= \frac{\frac{1}{2} \times \frac{1}{3}}{\frac{1}{2}} = \frac{1}{3}\]
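A quick simulation (a sketch; the seed and trial count are arbitrary) confirms the 1/3 answer:

```python
import random

random.seed(1)
cards = [("R", "R"), ("B", "B"), ("R", "B")]
red_showing = other_black = 0
for _ in range(200_000):
    card = list(random.choice(cards))
    random.shuffle(card)                     # a uniformly random side faces up
    if card[0] == "R":
        red_showing += 1
        other_black += card[1] == "B"
print(round(other_black / red_showing, 2))   # ≈ 0.33
```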

Think-Pair-Share: Family Composition

The Ambiguous Problem: A family has two children; the mother is seen walking with one of them, a girl.

Question: \(P(\text{both children are girls} \mid \text{this child is a girl})\) = ?

🚨 Warning: “This problem is incapable of solution without more information”

🤔 Discussion Points:

  1. What assumptions are we making?
  2. How does the selection process matter?
  3. What additional information do we need?

Example 9: Why It’s Unsolvable

Missing Information: How was this child selected?

Scenario A: Random selection from the two children

  • Given: the randomly chosen child is a girl
  • Answer: \(P(\text{both girls}) = \frac{1}{2}\)

Scenario B: Mother always walks with a girl if she has one

  • Given: Walking with a girl
  • Answer: \(P(\text{both girls}) = \frac{1}{3}\)

💡 Lesson

The probability depends crucially on the selection mechanism!

Advanced Application: Flashlight Quality

Quality Control Problem: Bin contains 3 types of flashlights

Performance Data:

  • Type 1: \(P(>100 \text{ hours}) = 0.7\), comprises 20% of bin
  • Type 2: \(P(>100 \text{ hours}) = 0.4\), comprises 30% of bin
  • Type 3: \(P(>100 \text{ hours}) = 0.3\), comprises 50% of bin

Questions:

  1. \(P(\text{random flashlight lasts} >100 \text{ hours})\) = ?
  2. Given flashlight lasted >100 hours, \(P(\text{it was Type } j)\) = ?

Flashlight Solution: Part A

Part A: Overall probability of lasting >100 hours

Using Total Probability: \[P(L) = P(L \mid T_1)P(T_1) + P(L \mid T_2)P(T_2) + P(L \mid T_3)P(T_3)\] \[= 0.7 \times 0.2 + 0.4 \times 0.3 + 0.3 \times 0.5\] \[= 0.14 + 0.12 + 0.15 = \boxed{0.41}\]

Flashlight Solution: Part B

Part B: Given flashlight lasted >100 hours, find type probabilities

Using Bayes’ Formula: \[P(T_1 \mid L) = \frac{0.7 \times 0.2}{0.41} = \frac{0.14}{0.41} \approx \boxed{0.341}\]

\[P(T_2 \mid L) = \frac{0.4 \times 0.3}{0.41} = \frac{0.12}{0.41} \approx \boxed{0.293}\]

\[P(T_3 \mid L) = \frac{0.3 \times 0.5}{0.41} = \frac{0.15}{0.41} \approx \boxed{0.366}\]
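Both parts can be verified in a few lines (illustrative sketch; variable names are mine):

```python
priors = [0.2, 0.3, 0.5]                 # bin composition: Type 1, 2, 3
p_last = [0.7, 0.4, 0.3]                 # P(lasts > 100 hours | type)
joint = [p * q for p, q in zip(p_last, priors)]
p_l = sum(joint)                         # Part A: total probability
posteriors = [j / p_l for j in joint]    # Part B: Bayes for each type
print(round(p_l, 2))                     # 0.41
print([round(x, 3) for x in posteriors]) # [0.341, 0.293, 0.366]
```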

Flashlight: Surprising Insight

Before Testing:

  • Type 1: 20%, Type 2: 30%, Type 3: 50%

After Lasting >100 hours:

  • Type 1: 34.1%, Type 2: 29.3%, Type 3: 36.6%

🤯 Insight

Type 3 becomes most likely, despite being worst initially!

Why? It’s so common that even with lower quality, it contributes most to successful flashlights.

Interactive Quiz: Test Your Understanding

Q1: What does Bayes’ formula allow us to do?

  • Add probabilities together
  • Calculate P(cause|effect) from P(effect|cause)
  • Find independence between events
  • Multiply probabilities

Q2: Medical Test Result

In the medical test with 0.5% prevalence and 95% sensitivity, why is P(disease|positive) only 32%?

  • The test is unreliable and poorly designed
  • Disease is rare, so false positives outnumber true positives
  • There was a calculation error in the example
  • The test sensitivity should be higher

Q3: Likelihood Ratio

What is the likelihood ratio in Bayes’ odds form?

  • P(H)/P(Hc)
  • P(E|H)/P(E|Hc)
  • P(H|E)/P(Hc|E)
  • P(E)/P(Ec)

Key Formulas Summary

Essential Formulas

Bayes’ Formula: \[\boxed{P(H \mid E) = \frac{P(E \mid H)P(H)}{P(E)}}\]

Total Probability: \[\boxed{P(E) = \sum_{i} P(E \mid H_i)P(H_i)}\]

Odds Form: \[\boxed{\text{Posterior Odds} = \text{Prior Odds} \times \text{Likelihood Ratio}}\]

Problem-Solving Strategy Guide

Step 1: Clearly define all events

Step 2: Identify given vs. asked

Step 3: Determine Bayes’ or Total Probability

Step 4: Set up formula with values

Step 5: Compute and interpret

⚠️ Common Pitfalls:

  • Confusing P(A|B) with P(B|A)
  • Forgetting Total Probability for denominator
  • Making unjustified independence assumptions
  • Ignoring base rates (prior probabilities)

Real-World Applications

Where Bayes’ Theorem matters:

  • 🏥 Medicine: Disease diagnosis, treatment decisions
  • ⚖️ Law: Evidence evaluation, jury reasoning
  • 📧 Technology: Spam filters, recommendation systems
  • 💰 Finance: Risk assessment, fraud detection
  • 🤖 AI/ML: Bayesian networks, probabilistic models
  • 🔬 Science: Hypothesis testing, experimental design

💡 Core Principle

Bayes’ Theorem is the mathematical foundation for rational updating of beliefs in light of new evidence.

Looking Ahead

Next Topics:

  • Random Variables and Distributions
  • Expected Value and Variance
  • Common Distributions: Binomial, Poisson, Normal
  • Continuous Probability

Connections:

  • Today’s conditional probability foundation
  • Independence from previous lecture
  • Building toward statistical inference

🌟 Remember: Bayes’ Theorem is the heart of rational reasoning under uncertainty - you’ll use it throughout statistics and beyond!

Final Reflection

🤔 Think About:

  1. How does Bayes’ Theorem change how you evaluate evidence?
  2. When have you encountered base rate neglect in real life?
  3. How can prior beliefs be both helpful and harmful?

💬 Discussion Question:

“In the medical test example, most people guess 95% instead of 32%. Why do humans struggle with Bayesian reasoning? What are the practical implications?”

Questions?

Thank you!

Office Hours: By appointment via email

Contact: sorujov@ada.edu.az