Mathematical Statistics

The Gamma Probability Distribution

Samir Orujov, PhD

ADA University, School of Business

Information Communication Technologies Agency, Statistics Unit

2025-12-06

Learning Objectives

By the end of this lecture, you will be able to:

  • Define the gamma distribution and identify its shape parameter (\(\alpha\)) and scale parameter (\(\beta\)), and understand how they affect distribution characteristics

  • Compute probabilities, expected values, and variances for gamma-distributed random variables, including exponential and chi-square special cases

  • Apply the exponential distribution to model waiting times and understand the memoryless property in reliability and financial risk contexts

  • Use the chi-square distribution (\(\chi^2\)) for statistical inference, hypothesis testing, and volatility modeling in finance

  • Solve real-world problems involving skewed distributions in insurance claims, component lifetimes, income distributions, and water demand forecasting

Overview

Topics Covered Today

  • Skewed Distributions: Understanding asymmetric data patterns and when they arise

  • Gamma Distribution: The general form with shape and scale parameters

  • Chi-Square Distribution: A special case with applications in hypothesis testing and variance estimation

  • Exponential Distribution: Modeling lifetimes and the memoryless property

  • Applications: Insurance claims, reliability engineering, water demand, income distributions, and financial risk modeling

Definition: Skewed Distribution

Concept: Skewed Distributions

A skewed distribution is one whose values are not spread symmetrically around the center: the data lean toward the left or right side of the scale, producing a nonsymmetrical curve.

Key Characteristics:

  • Asymmetry: The left side is shaped differently than the right side

  • Tail behavior: One tail is longer than the other, indicating more extreme values on that side

  • Difference from normal: Unlike the Gaussian (normal) distribution, which is symmetric around the mean with zero skewness, skewed distributions have the mean, median, and mode at different locations

  • Real-world prevalence: Many phenomena exhibit skewness, including income distributions, insurance claims, component lifetimes, and asset returns

Financial Context: Income distributions are typically right-skewed (positively skewed) with a long tail extending toward higher incomes, while most people cluster at lower to moderate income levels.

Types of Skewness

Right-Skewed (Positive Skew)

  • Tail extends to the right
  • Mean > Median > Mode
  • Common in: income, insurance claims, asset returns

Examples:

  • Executive compensation
  • Insurance loss amounts
  • Real estate prices
  • Time to equipment failure
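
As a quick numerical illustration of the Mean > Median > Mode ordering, the sketch below uses a gamma distribution (introduced later in this lecture) as a stand-in for a right-skewed variable. The parameter values are assumptions chosen for the example, not taken from any dataset.

```r
# Right-skewed example: gamma distribution with illustrative parameters
shape <- 2   # assumed for illustration
scale <- 2   # assumed for illustration

mean_y   <- shape * scale                              # mean = shape * scale
median_y <- qgamma(0.5, shape = shape, scale = scale)  # 50th percentile
mode_y   <- (shape - 1) * scale                        # mode, valid for shape >= 1

cat(sprintf("Mean: %.3f  Median: %.3f  Mode: %.3f\n", mean_y, median_y, mode_y))
```

The long right tail pulls the mean furthest to the right, the median less so, and leaves the mode closest to zero.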

Left-Skewed (Negative Skew)

  • Tail extends to the left
  • Mean < Median < Mode
  • Common in: age at death, test scores with ceiling effects

Examples:

  • Human lifespan (peaks at 75-80 years)
  • Product quality ratings (most cluster at high ratings)
  • Time remaining until retirement

Example 1: Human Lifespan Distribution

Problem: The average human life span chart skews left. If the chart shows values from 1 to 100 (representing years of life), explain the distribution characteristics.

Analysis:

The data shows that most people live to around 75 to 80 years old, which means:

  • Peak location: The distribution's peak (mode) is closer to the right of the chart (near 75-80 years)

  • Tail direction: The chart's tail is longer on its left side because the values around 75 and 80 are closer to 100 than to 1

  • Asymmetry: Relatively few people die at very young ages (thanks to modern medicine and sanitation), so infant mortality and premature deaths form a long, thin left tail, while the short distance between the peak and the maximum lifespan keeps the right tail short

Interpretation: This left-skewed pattern reflects that in developed countries, medical advances have pushed most deaths toward older ages, with exceptional cases of early death creating the left tail. In contrast, there's a biological upper limit on lifespan, creating a natural boundary on the right side.

Definition: Gamma Probability Distribution

Definition 1: Gamma Distribution

A random variable \(Y\) is said to have a gamma distribution with parameters \(\alpha > 0\) and \(\beta > 0\) if and only if the probability density function (pdf) of \(Y\) is:

\[f(y) = \begin{cases} \frac{y^{\alpha-1}e^{-y/\beta}}{\beta^\alpha\Gamma(\alpha)}, & 0 \leq y < \infty, \\ 0, & \text{elsewhere}, \end{cases}\]

where the gamma function is defined as:

\[\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1}e^{-y} \,dy\]

Key Properties of the Gamma Function:

  • \(\Gamma(1) = 1\)

  • \(\Gamma(\alpha) = (\alpha - 1)\Gamma(\alpha - 1)\) for any \(\alpha > 1\) (recursive property)

  • \(\Gamma(n) = (n - 1)!\) for positive integers \(n\)
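
These properties are easy to confirm numerically. The sketch below uses base R's `gamma()` and `factorial()`; the value 3.7 is an arbitrary choice for illustration.

```r
# Base R's gamma() evaluates the gamma function
g1 <- gamma(1)                       # Gamma(1)

alpha <- 3.7                         # arbitrary value > 1, assumed for illustration
lhs <- gamma(alpha)                  # Gamma(alpha)
rhs <- (alpha - 1) * gamma(alpha - 1)  # recursive property

g5 <- gamma(5)                       # Gamma(n) with n = 5
f4 <- factorial(4)                   # (n - 1)! with n = 5

cat(sprintf("Gamma(1) = %.1f\n", g1))
cat(sprintf("Gamma(3.7) = %.6f, (2.7) * Gamma(2.7) = %.6f\n", lhs, rhs))
cat(sprintf("Gamma(5) = %.0f, 4! = %.0f\n", g5, f4))
```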

Understanding Gamma Parameters

Shape Parameter: \(\alpha\)

Effect on Distribution:

  • Controls the shape of the distribution
  • \(\alpha < 1\): J-shaped (decreasing from infinity at \(y = 0\))
  • \(\alpha = 1\): Exponential distribution
  • \(\alpha > 1\): Unimodal with peak shifting right as \(\alpha\) increases
  • Larger \(\alpha\) makes distribution more symmetric (approaches normal)

Reliability Application: In reliability engineering, an integer \(\alpha\) can be interpreted as the number of stages or phases a component passes through before failure.

Scale Parameter: \(\beta\)

Effect on Distribution:

  • Controls the scale or spread of the distribution
  • Larger \(\beta\) stretches distribution to the right
  • Smaller \(\beta\) compresses distribution toward zero
  • Does not affect shape, only the x-axis scaling

Financial Application: In insurance, \(\beta\) rescales claim amounts (for example, for inflation or a change of currency) while the shape of the claim-size distribution, governed by \(\alpha\), stays the same.

Theorem: Mean and Variance of Gamma Distribution

Theorem 1: Expected Value and Variance

If \(Y\) has a gamma distribution with parameters \(\alpha\) and \(\beta\), then:

\[\boxed{\mu = E(Y) = \alpha \beta}\]

and

\[\boxed{\sigma^2 = V(Y) = \alpha \beta^2}\]

Derivation Insight: These formulas follow from integration by parts applied to the definition of expected value and variance using the gamma density function.
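
One way to organize the derivation, sketched here under the definitions above, is to use the identity \(\int_0^\infty y^{a-1}e^{-y/\beta}\,dy = \beta^{a}\Gamma(a)\) (a gamma density with shape \(a\) integrates to one) together with the recursive property of the gamma function:

\[E(Y) = \int_0^\infty y \cdot \frac{y^{\alpha-1}e^{-y/\beta}}{\beta^\alpha\Gamma(\alpha)}\,dy = \frac{\beta^{\alpha+1}\Gamma(\alpha+1)}{\beta^\alpha\Gamma(\alpha)} = \alpha\beta\]

\[E(Y^2) = \int_0^\infty y^2 \cdot \frac{y^{\alpha-1}e^{-y/\beta}}{\beta^\alpha\Gamma(\alpha)}\,dy = \frac{\beta^{\alpha+2}\Gamma(\alpha+2)}{\beta^\alpha\Gamma(\alpha)} = \alpha(\alpha+1)\beta^2\]

\[V(Y) = E(Y^2) - [E(Y)]^2 = \alpha(\alpha+1)\beta^2 - \alpha^2\beta^2 = \alpha\beta^2\]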

Important Note: Unless \(\alpha\) is a positive integer (with \(\alpha = 1\) giving the exponential distribution), areas under the gamma density function have no closed-form expression and cannot be obtained by direct integration. We typically use:

  • Statistical software (R, Python)
  • Numerical integration methods
  • Tables for chi-square distribution (when applicable)
  • Online calculators/applets
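
As a minimal sketch of the software route, the code below compares numerical integration of the density with R's built-in gamma CDF. The parameter values and the upper limit are assumptions chosen for illustration.

```r
# Illustrative parameters (assumed): non-integer shape
alpha <- 2.5
beta  <- 1.5
upper <- 4

# Numerical integration of the gamma density over [0, upper]
area_numeric <- integrate(dgamma, lower = 0, upper = upper,
                          shape = alpha, scale = beta)$value

# Built-in gamma CDF, P(Y <= upper)
area_cdf <- pgamma(upper, shape = alpha, scale = beta)

cat(sprintf("integrate(): %.6f   pgamma(): %.6f\n", area_numeric, area_cdf))
```

Note that `pgamma()` accepts either `scale` (this lecture's \(\beta\)) or `rate` (\(1/\beta\)); naming the argument explicitly avoids mixing them up.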

Interactive: Gamma Distribution Explorer

Explore Gamma Parameters: Adjust \(\alpha\) (shape) and \(\beta\) (scale) to see their effects on the distribution.

Observations:

  • Increase \(\alpha\): peak moves right, distribution becomes more symmetric

  • Increase \(\beta\): distribution stretches horizontally

  • When \(\alpha = 1\): exponential distribution
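
Without the interactive widget, the same observations can be read off a small table. The sketch below relies on two standard gamma facts: the mode is \((\alpha - 1)\beta\) for \(\alpha \geq 1\), and the skewness is \(2/\sqrt{\alpha}\). The parameter grid is an assumption chosen for illustration.

```r
# Mode and skewness of the gamma distribution for several shapes (scale fixed)
beta   <- 2                  # assumed scale for illustration
alphas <- c(1, 2, 5, 20)     # assumed shapes for illustration

summary_tbl <- data.frame(
  alpha    = alphas,
  mode     = (alphas - 1) * beta,  # peak location, valid for alpha >= 1
  mean     = alphas * beta,        # mean = alpha * beta
  skewness = 2 / sqrt(alphas)      # depends on alpha only, not on beta
)
print(summary_tbl)
```

The mode grows with \(\alpha\) while the skewness shrinks toward zero, which is the "more symmetric" behavior noted above.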

Example 2: Computer Response Time

Problem: The response times on an online computer terminal have approximately a gamma distribution with a mean of four seconds and a variance of eight seconds squared.

  (a) Write the probability density function for the response times.

  (b) What is the probability that the response time on the terminal is less than five seconds?

Solution (Part a):

Let \(Y\) be the response time variable. Given: \(\mu = 4\) and \(\sigma^2 = 8\).

Using the theorem, we have: \(\mu = \alpha\beta = 4\) and \(\sigma^2 = \alpha\beta^2 = 8\).

From these equations: \[\frac{\sigma^2}{\mu} = \frac{\alpha\beta^2}{\alpha\beta} = \beta = \frac{8}{4} = 2\]

Therefore: \(\alpha = \frac{\mu}{\beta} = \frac{4}{2} = 2\)
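
The two steps above amount to a method-of-moments calculation. A minimal sketch follows; the function name `gamma_from_moments` is hypothetical and introduced only for this illustration.

```r
# Hypothetical helper: recover alpha and beta from a gamma mean and variance
gamma_from_moments <- function(mu, sigma2) {
  beta  <- sigma2 / mu   # because sigma^2 / mu = beta
  alpha <- mu / beta     # because mu = alpha * beta
  c(alpha = alpha, beta = beta)
}

# Example 2: mean 4, variance 8
params <- gamma_from_moments(mu = 4, sigma2 = 8)
print(params)
```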

Example 2: Solution (continued)

With \(\alpha = 2\) and \(\beta = 2\), the probability density function is:

\[f(y) = \begin{cases} \frac{y^{2-1}e^{-y/2}}{2^2\Gamma(2)}, & 0 \leq y < \infty \\ 0, & \text{elsewhere} \end{cases}\]

Since \(\Gamma(2) = 1! = 1\), this simplifies to:

\[\boxed{f(y) = \begin{cases} \frac{ye^{-y/2}}{4}, & 0 \leq y < \infty \\ 0, & \text{elsewhere} \end{cases}}\]

Solution (Part b):

To find \(P(Y < 5)\), we calculate: \[P(Y < 5) = \int_{0}^{5} \frac{y}{4} e^{-\frac{y}{2}} \,dy\]

Example 2: Integration by Parts

Integration by Parts:

Let \(u = \frac{y}{4}\) and \(dv = e^{-\frac{y}{2}} dy\)

Then: \(du = \frac{1}{4}dy\) and \(v = -2e^{-\frac{y}{2}}\)

Applying the integration by parts formula \(\int u \, dv = uv - \int v \, du\):

\[\int \frac{y}{4}e^{-\frac{y}{2}} \, dy = -\frac{y}{2} \cdot e^{-\frac{y}{2}} - \int -\frac{1}{2} e^{-\frac{y}{2}}\, dy = -\frac{y}{2} \cdot e^{-\frac{y}{2}} - e^{-\frac{y}{2}}\]

Evaluating from 0 to 5:

\[P(Y < 5) = \left[-\frac{y}{2} \cdot e^{-\frac{y}{2}} - e^{-\frac{y}{2}}\right]^5_0 = -\frac{7}{2} \cdot e^{-\frac{5}{2}} + 1\]

\[\boxed{P(Y < 5) \approx 0.7127 \text{ or } 71.27\%}\]

Interpretation: About 71% of the time, the computer responds within 5 seconds.
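
The hand calculation can be cross-checked in software. This sketch compares the closed form obtained by integration by parts with base R's `pgamma()`, using \(\alpha = 2\) and \(\beta = 2\) from the example.

```r
# Closed form from the integration by parts above
p_closed <- 1 - (7 / 2) * exp(-5 / 2)

# Same probability from R's gamma CDF (shape = alpha, scale = beta)
p_cdf <- pgamma(5, shape = 2, scale = 2)

cat(sprintf("Closed form: %.4f   pgamma(): %.4f\n", p_closed, p_cdf))
```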

Think-Pair-Share: IT Infrastructure Planning

Student Engagement Activity (5 minutes)

Scenario: You are an IT manager at a financial services company. Server response times follow a gamma distribution with mean \(\mu = 3\) seconds and standard deviation \(\sigma = 2\) seconds. The company's service level agreement (SLA) requires that 90% of requests complete within 6 seconds.

Think (1 minute): Work individually

  • Calculate the parameters \(\alpha\) and \(\beta\) for this gamma distribution

  • Does the current system meet the SLA requirement? (Use the fact that \(P(Y < 6) \approx 0.92\) for these parameters)

  • What business impact occurs if the SLA is violated?

Pair (2-3 minutes): Discuss with a partner

  • Compare your parameter calculations

  • Discuss whether system upgrades are needed

  • Consider the trade-off between upgrade costs and SLA compliance

Share (1-2 minutes): Class discussion

  • Selected pairs share their recommendations

  • Discuss how modeling response times helps capacity planning and investment decisions
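
For checking the Think step in software, a short sketch follows. It uses only the scenario's mean and standard deviation, the moment relations \(\beta = \sigma^2/\mu\) and \(\alpha = \mu/\beta\), and base R's `pgamma()`.

```r
mu    <- 3   # mean response time in seconds (from the scenario)
sigma <- 2   # standard deviation in seconds (from the scenario)

beta  <- sigma^2 / mu   # scale parameter
alpha <- mu / beta      # shape parameter

# Probability that a request completes within 6 seconds
p_within_6 <- pgamma(6, shape = alpha, scale = beta)
meets_sla  <- p_within_6 >= 0.90   # SLA target: 90% within 6 seconds

cat(sprintf("alpha = %.3f, beta = %.3f, P(Y < 6) = %.4f, SLA met: %s\n",
            alpha, beta, p_within_6, meets_sla))
```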

Definition: Chi-Square Distribution

Definition 2: Chi-Square (\(\chi^2\)) Distribution

Let \(\nu\) be a positive integer. A random variable \(Y\) is said to have a chi-square distribution with \(\nu\) degrees of freedom if and only if \(Y\) is a gamma-distributed random variable with parameters:

\[\alpha = \frac{\nu}{2} \quad \text{and} \quad \beta = 2\]

Notation: We write \(Y \sim \chi^2_\nu\) to denote that \(Y\) has a chi-square distribution with \(\nu\) degrees of freedom.

The pdf becomes: \[f(y) = \begin{cases} \frac{y^{\nu/2-1}e^{-y/2}}{2^{\nu/2}\Gamma(\nu/2)}, & 0 \leq y < \infty \\ 0, & \text{elsewhere} \end{cases}\]

Theorem 2: Mean and Variance of Chi-Square Distribution

If \(Y \sim \chi^2_\nu\), then: \(\boxed{\mu = E(Y) = \nu}\) and \(\boxed{\sigma^2 = V(Y) = 2\nu}\)

Relationship: Gamma to Chi-Square

Converting Gamma to Chi-Square

Key Result: If \(Y\) has a gamma distribution with shape \(\alpha = \frac{n}{2}\) for some positive integer \(n\) and scale \(\beta\), then

\[\frac{2Y}{\beta} \sim \chi^2_n,\]

that is, \(2Y/\beta\) has a chi-square distribution with \(n\) degrees of freedom.

Why This Matters: Chi-square distributions have extensive tables and computational support, making them easier to work with than general gamma distributions.

Example 3: Converting Gamma to Chi-Square

Problem: If \(Y\) has a gamma distribution with \(\alpha = 1.5 = \frac{3}{2}\) and \(\beta = 4\), find \(P(Y < 3.5)\) using the chi-square distribution.

Solution:

Since \(\alpha = \frac{3}{2}\), we can use the relationship:

\[\frac{2Y}{\beta} = \frac{2Y}{4} = \frac{Y}{2} \sim \chi^2_3\]

Therefore: \[P(Y < 3.5) = P\left(\frac{Y}{2} < \frac{3.5}{2}\right) = P\left(\frac{Y}{2} < 1.75\right) = P(\chi^2_3 < 1.75)\]

Using chi-square tables or software with \(\nu = 3\) degrees of freedom and value 1.75:

\[\boxed{P(Y < 3.5) = P(\chi^2_3 < 1.75) \approx 0.374}\]

Application: This technique is commonly used in hypothesis testing and variance estimation in financial econometrics.
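
The conversion in Example 3 can be confirmed numerically: the gamma CDF evaluated at 3.5 and the chi-square CDF evaluated at the rescaled value 1.75 should agree. A minimal sketch in base R:

```r
# Direct route: gamma CDF with alpha = 1.5, beta = 4
p_gamma <- pgamma(3.5, shape = 1.5, scale = 4)

# Chi-square route: 2Y/beta = Y/2 has 3 degrees of freedom
p_chisq <- pchisq(3.5 / 2, df = 3)

cat(sprintf("pgamma(): %.4f   pchisq(): %.4f\n", p_gamma, p_chisq))
```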

Interactive: Chi-Square Distribution by Degrees of Freedom

Explore Chi-Square: Adjust degrees of freedom to see how the distribution shape changes.

Key Patterns:

  • Low \(\nu\): highly right-skewed

  • High \(\nu\): approaches normal

  • Used in hypothesis testing

Definition: Exponential Distribution

Definition 3: Exponential Distribution

A random variable \(Y\) is said to have an exponential distribution with parameter \(\beta > 0\) if and only if the density function of \(Y\) is:

\[f(y) = \begin{cases} \frac{1}{\beta} e^{-y/\beta}, & 0 \leq y < \infty \\ 0, & \text{elsewhere} \end{cases}\]

Note: This is a special case of the gamma distribution with \(\alpha = 1\).

Key Properties:

  • Memoryless property: The probability of an event occurring in the future is independent of how much time has already elapsed

  • Single parameter: The shape is fixed (\(\alpha = 1\)), so \(\beta\) alone sets the scale and therefore both the mean and the variance

  • Common applications: Time between arrivals, component lifetimes, time until failure

Theorem 3: Mean and Variance of Exponential Distribution

If \(Y\) has an exponential distribution with parameter \(\beta\), then: \(\boxed{\mu = E(Y) = \beta}\) and \(\boxed{\sigma^2 = V(Y) = \beta^2}\)

The Memoryless Property

Memoryless Property of Exponential Distribution

Definition: A random variable \(Y\) is memoryless if:

\[P(Y > a + b \mid Y > a) = P(Y > b) \; \text{for all}\; a > 0 \; \text{and} \; b > 0.\]

Interpretation:

  • If a component has already lasted \(a\) time units, the probability it lasts an additional \(b\) time units is the same as if it were brand new
  • The past does not affect future probabilities
  • Example: A fuse that hasn't blown after 100 hours is just as likely to last another 50 hours as a new fuse is to last 50 hours

Mathematical Proof: Using the definition of conditional probability:

\[P(Y > a + b \mid Y > a) = \frac{P(Y > a + b)}{P(Y > a)} = \frac{e^{-(a+b)/\beta}}{e^{-a/\beta}} = e^{-b/\beta} = P(Y > b)\]

Example 4: Memoryless Property Verification

Problem: Suppose that \(Y\) has an exponential probability density function with parameter \(\beta\). Show that if \(a > 0\) and \(b > 0\):

\[P(Y > a + b \mid Y > a) = P(Y > b)\]

Solution:

From the definition of conditional probability: \[P(Y > a + b \mid Y > a) = \frac{P\big((Y > a + b) \cap (Y > a)\big)}{P(Y > a)}\]

Since \((Y > a + b) \cap (Y > a) = (Y > a + b)\): \[P(Y > a + b \mid Y > a) = \frac{P(Y > a + b)}{P(Y > a)}\]

Example 4: Solution (continued)

Computing the probabilities:

\[P(Y > a + b) = \int_{a+b}^{\infty} \frac{1}{\beta}e^{-y/\beta} \,dy = -e^{-y/\beta} \Big|_{a+b}^{\infty} = e^{-(a+b)/\beta}\]

Similarly: \[P(Y > a) = \int_{a}^{\infty} \frac{1}{\beta}e^{-y/\beta} \,dy = e^{-a/\beta}\]

Therefore: \[P(Y > a + b \mid Y > a) = \frac{e^{-(a+b)/\beta}}{e^{-a/\beta}} = e^{-b/\beta} = P(Y > b) \quad \blacksquare\]

Financial Interpretation: In modeling default times for credit risk, the memoryless property implies that a bond that hasn't defaulted so far is as likely to default in the next period as it was initially. This may not be realistic, which motivates more complex models.

Interactive: Exponential Distribution

Explore Exponential Distribution: Adjust \(\beta\) to see how it affects the distribution and memoryless property.

Memoryless: \(P(Y > 5 \mid Y > 2) = P(Y > 3)\)

Always true for exponential!
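
A numerical check of this identity is sketched below, with a gamma distribution of shape 2 added for contrast. The value \(\beta = 4\) and the helper names `surv_exp` and `surv_gam` are assumptions made for the illustration.

```r
beta <- 4   # assumed scale for illustration
a <- 2      # time already elapsed
b <- 3      # additional time

# Survival functions P(Y > y)
surv_exp <- function(y) pexp(y, rate = 1 / beta, lower.tail = FALSE)
surv_gam <- function(y) pgamma(y, shape = 2, scale = beta, lower.tail = FALSE)

# Exponential: conditional and unconditional probabilities coincide
lhs_exp <- surv_exp(a + b) / surv_exp(a)   # P(Y > 5 | Y > 2)
rhs_exp <- surv_exp(b)                     # P(Y > 3)

# Gamma with shape 2: the two probabilities differ
lhs_gam <- surv_gam(a + b) / surv_gam(a)
rhs_gam <- surv_gam(b)

cat(sprintf("Exponential: %.4f vs %.4f\n", lhs_exp, rhs_exp))
cat(sprintf("Gamma(2):    %.4f vs %.4f\n", lhs_gam, rhs_gam))
```

For the gamma case the conditional probability is smaller, so a component that has already survived is less likely to last another \(b\) units than a new one; that is the wear-out behavior the exponential model cannot represent.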

Example 5: Water Demand at Pumping Station

Problem: The operator of a pumping station has observed that demand for water during early afternoon hours has an approximately exponential distribution with mean 100 cfs (cubic feet per second).

  (a) Find the probability that the demand will exceed 200 cfs during the early afternoon on a randomly selected day.

  (b) What water-pumping capacity should the station maintain during early afternoons so that the probability that demand will exceed capacity on a randomly selected day is only 0.01?

Solution (Part a):

Since the mean of an exponential random variable with parameter \(\beta\) equals \(\beta\), we have \(\beta = 100\).

The pdf is: \[f(y) = \begin{cases} \frac{1}{100} e^{-y/100}, & 0 \leq y < \infty \\ 0, & \text{elsewhere} \end{cases}\]

Example 5: Solution (continued)

To find \(P(Y > 200)\):

\[P(Y > 200) = \int_{200}^{\infty} \frac{1}{100} e^{-\frac{y}{100}} \, dy = -e^{-\frac{y}{100}} \Big|_{200}^{\infty} = e^{-2}\]

\[\boxed{P(Y > 200) = e^{-2} \approx 0.1353 \text{ or } 13.53\%}\]

Interpretation: There's about a 13.5% chance that water demand will exceed 200 cfs on any given afternoon.

Solution (Part b):

This is a reverse problem. We need to find \(y\) such that \(P(Y > y) = 0.01\).

\[P(Y > y) = \int_{y}^{\infty} \frac{1}{100} e^{-\frac{t}{100}} \, dt = e^{-\frac{y}{100}} = 0.01\]

Example 5: Solution (continued)

Taking the natural logarithm of both sides:

\[-\frac{y}{100} = \ln(0.01)\]

\[y = -100 \ln(0.01) = 100 \ln(100)\]

\[\boxed{y \approx 460.52 \text{ cfs}}\]

Business Recommendation: The pumping station should maintain a capacity of at least 461 cfs to ensure that demand exceeds capacity on only 1% of days (approximately 3-4 days per year).

Cost-Benefit Analysis: This capacity provides high reliability (99% service level) while avoiding over-investment in excessive capacity that would rarely be needed. The station manager can balance the cost of additional capacity against the cost of water shortages.
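
Both parts of Example 5 map directly onto base R functions: `pexp()` for the tail probability and `qexp()` for the capacity quantile. Note that R parameterizes the exponential by its rate, \(1/\beta\). A minimal sketch:

```r
beta <- 100   # mean demand in cfs (from the example)

# Part (a): P(Y > 200)
p_exceed_200 <- pexp(200, rate = 1 / beta, lower.tail = FALSE)

# Part (b): capacity exceeded with probability 0.01, i.e. the 99th percentile
capacity_99 <- qexp(0.99, rate = 1 / beta)

cat(sprintf("P(Y > 200) = %.4f\n", p_exceed_200))
cat(sprintf("99th percentile capacity = %.2f cfs\n", capacity_99))
```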

Case Study: Insurance Claim Amounts (Real Data)

Fire Insurance Loss Modeling

Context: Insurance companies model loss amounts using gamma distributions because losses are naturally right-skewed with a long tail of catastrophic events. We analyze Danish fire insurance losses from 1980-1990.

Key Questions:

  • What are the optimal shape (\(\alpha\)) and scale (\(\beta\)) parameters using MLE?

  • What proportion of losses exceed certain thresholds (e.g., 10M, 20M DKK)?

  • How well does the gamma model fit, and what are its limitations?

Data Source

We analyze Danish fire insurance losses (1980-1990) - a classic actuarial dataset.

Source: R package fitdistrplus - danishuni dataset

Sample Size: 2,167 individual fire insurance claims

Data Type: Loss amounts in millions of Danish Krone (DKK)

Application: Widely used for demonstrating actuarial modeling, extreme value analysis, and heavy-tailed distributions

Case Study: Data Loading and Parameter Estimation

Code
# Load required libraries
library(tidyverse)
library(fitdistrplus)  # For MLE fitting

# Load real insurance loss data from fitdistrplus package
# Danish fire insurance losses (1980-1990): 2,167 claims
# This is a classic actuarial dataset for gamma distribution modeling
data(danishuni)

# Extract loss amounts (in millions of Danish Krone)
claims <- danishuni$Loss

# Verify we have sufficient data
if (length(claims) < 100) {
  stop(paste("Error: Only", length(claims), "claims loaded. Need at least 100."))
}

cat("Data source: Danish fire insurance losses (1980-1990)\n")
Data source: Danish fire insurance losses (1980-1990)
Code
cat("Original values in millions of Danish Krone\n")
Original values in millions of Danish Krone
Code
# Summary statistics
n_claims <- length(claims)
mean_claim <- mean(claims)
median_claim <- median(claims)
sd_claim <- sd(claims)
max_claim <- max(claims)

cat(sprintf("Sample size: %d claims\n", n_claims))
Sample size: 2167 claims
Code
cat(sprintf("Mean claim: %.2f M DKK\n", mean_claim))
Mean claim: 3.39 M DKK
Code
cat(sprintf("Median claim: %.2f M DKK\n", median_claim))
Median claim: 1.78 M DKK
Code
cat(sprintf("Std Dev: %.2f M DKK\n", sd_claim))
Std Dev: 8.51 M DKK
Code
cat(sprintf("Max claim: %.2f M DKK\n", max_claim))
Max claim: 263.25 M DKK
Code
# Estimate gamma parameters using Maximum Likelihood Estimation (MLE)
# MLE provides optimal parameter estimates for the gamma distribution
# fitdistrplus uses numerical optimization for robust fitting

fit_gamma <- fitdist(claims, "gamma", method = "mle")

# Extract parameters (note: fitdist uses shape and rate parameterization)
alpha_est <- fit_gamma$estimate["shape"]
rate_est <- fit_gamma$estimate["rate"]
beta_est <- 1 / rate_est  # Convert rate to scale parameter

cat(sprintf("Shape (α): %.3f\n", alpha_est))
Shape (α): 1.298
Code
cat(sprintf("Scale (β): %.2f M DKK\n", beta_est))
Scale (β): 2.61 M DKK
Code
cat(sprintf("Rate (1/β): %.4f\n", rate_est))
Rate (1/β): 0.3833
Code
cat(sprintf("\nEstimated mean: %.2f M DKK\n", 
            alpha_est * beta_est))

Estimated mean: 3.39 M DKK
Code
cat(sprintf("Sample mean: %.2f M DKK\n", mean_claim))
Sample mean: 3.39 M DKK
Code
cat(sprintf("\nEstimated SD: %.2f M DKK\n", 
            sqrt(alpha_est * beta_est^2)))

Estimated SD: 2.97 M DKK
Code
cat(sprintf("Sample SD: %.2f M DKK\n", sd_claim))
Sample SD: 8.51 M DKK
Code
cat(sprintf("\nLog-likelihood: %.2f\n", fit_gamma$loglik))

Log-likelihood: -4767.10

Probabilities of Exceeding Thresholds
=====================================
P(Loss > 5M DKK): 0.2231 (22.31%)
P(Loss > 10M DKK): 0.0384 (3.84%)
P(Loss > 20M DKK): 0.0010 (0.10%)
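
The code that produced the threshold table is not shown. One plausible way to reproduce it, assuming the probabilities are upper-tail areas of the fitted gamma model, is sketched below. It plugs in the rounded estimates printed earlier rather than the `fit_gamma` object, so the last digit may differ slightly.

```r
# Rounded MLE estimates as printed above (assumption: used in place of fit_gamma)
shape_hat <- 1.298
rate_hat  <- 0.3833

thresholds <- c(5, 10, 20)   # loss thresholds in millions of DKK

# Upper-tail probabilities P(Loss > threshold) under the fitted gamma model
p_exceed <- pgamma(thresholds, shape = shape_hat, rate = rate_hat,
                   lower.tail = FALSE)

for (i in seq_along(thresholds)) {
  cat(sprintf("P(Loss > %dM DKK): %.4f (%.2f%%)\n",
              thresholds[i], p_exceed[i], 100 * p_exceed[i]))
}
```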

Case Study: Visualization and Model Validation

Model Fit Assessment: Interpreting Diagnostic Plots

Observations

From the diagnostic plots:

  • Histogram/Density: Overall fit appears reasonable for the bulk of the data
  • Q-Q Plot (Quantile-Quantile): Shows departure from the reference line in the upper tail (right side)
  • P-P Plot (Probability-Probability): Shows systematic deviation, especially in the tails

Why the Departure?

  1. Heavy-tailed nature: Fire insurance losses have heavier tails than the gamma distribution can capture

  2. Extreme values: A few very large losses (>100M DKK) create tail behavior that deviates from gamma

  3. Not a problem with estimation: MLE is correctly estimating gamma parameters, but gamma may not be the ideal model for extreme losses

Better Alternatives

For heavy-tailed data:

  • Pareto distribution: Specifically designed for heavy tails
  • Log-normal distribution: Often fits insurance claims better
  • Generalized Pareto (GPD): Used in extreme value theory
  • Mixture models: Combine gamma for moderate losses + extreme value distribution for large losses

Key Lesson

Gamma is useful for moderate right-skewed data, but real insurance losses often require more sophisticated models.

Case Study: Actuarial Applications and Pricing

Distributional Characteristics

Danish Fire Insurance Losses 1980-1990:

  • Shape parameter \(\alpha \approx 1.30\): Indicates strong right skew (gamma skewness \(2/\sqrt{\alpha} \approx 1.76\)), typical of insurance losses

  • Scale parameter \(\beta \approx 2.61\) million DKK: Controls the spread of loss amounts

  • Mean loss: \(\alpha \times \beta \approx 3.39\) million DKK per claim

  • Interpretation: \(\alpha > 1\) means the distribution has a mode away from zero, with a substantial right tail for catastrophic losses

Business Applications

Premium Pricing Strategy:

  1. Expected severity: The mean loss per claim is \(\mu \approx 3.39\) million DKK; the pure premium per policy is this amount multiplied by the expected number of claims per policy

  2. Loading for risk: Add safety margin for tail losses using percentiles (e.g., 90th, 95th, 99th)

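  3. Risk assessment: The fitted coefficient of variation \(\text{CV} = 1/\sqrt{\alpha} \approx 0.88\) suggests moderate variability, but the sample CV is about \(8.51/3.39 \approx 2.5\), so the gamma fit understates the dispersion driven by the largest losses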

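Reserve Estimation: The gamma model with \(\alpha \approx 1.3\) describes the bulk of fire insurance losses, where most claims are moderate. Because it understates the extreme tail (fitted SD 2.97 versus sample SD 8.51 million DKK), reserves for major fires should be checked against a heavier-tailed model.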

Quiz #1: Gamma Distribution Parameters

For a gamma distribution with \(\alpha = 3\) and \(\beta = 5\), what is the mean?

  • \(\mu = 15\)
  • \(\mu = 8\)
  • \(\mu = 75\)
  • \(\mu = 3\)

Quiz #2: Chi-Square Relationship

A random variable \(Y\) has a gamma distribution with \(\alpha = 4\) and \(\beta = 2\). What distribution does \(\frac{2Y}{\beta} = Y\) follow?

  • \(\chi^2\) with 8 degrees of freedom
  • \(\chi^2\) with 4 degrees of freedom
  • \(\chi^2\) with 2 degrees of freedom
  • Exponential with parameter 2

Quiz #3: Exponential Distribution

Which property uniquely characterizes the exponential distribution among continuous distributions?

  • The memoryless property: P(Y > a + b | Y > a) = P(Y > b)
  • It has mean equal to variance
  • It is symmetric around its mean
  • It has bounded support

Quiz #4: Variance Formula

If a gamma-distributed random variable has \(\alpha = 2\) and variance \(\sigma^2 = 32\), what is the scale parameter \(\beta\)?

  • \(\beta = 4\)
  • \(\beta = 16\)
  • \(\beta = 8\)
  • \(\beta = 2\)

Summary

Key Takeaways

  • The gamma distribution is a flexible two-parameter family (\(\alpha\) and \(\beta\)) that models right-skewed, non-negative continuous data such as waiting times, claim amounts, and component lifetimes

  • Special cases include the exponential distribution (\(\alpha = 1\)) and chi-square distribution (\(\alpha = \nu/2\), \(\beta = 2\)), each with specific applications in reliability, queuing, and statistical inference

  • The exponential distribution possesses the unique memoryless property, making it suitable for modeling random arrivals and component failures where past history doesn't affect future probabilities

  • Mean and variance formulas (\(\mu = \alpha\beta\) and \(\sigma^2 = \alpha\beta^2\)) allow parameter estimation from sample data using method of moments or maximum likelihood

  • Financial applications include insurance loss modeling, credit risk assessment, reliability engineering, and income distribution analysis, where the gamma family captures realistic skewness and tail behavior

Practice Problems

Homework Problems

Problem 1 (Insurance Claims): An insurance company models claim amounts using a gamma distribution with mean $15,000 and variance 112,500,000 (in dollars squared). Find: (a) the parameters \(\alpha\) and \(\beta\); (b) the probability a claim exceeds $30,000; (c) the 95th percentile of claim amounts for reserve planning.

Problem 2 (System Reliability): The lifetime of a critical server component follows an exponential distribution with mean 5000 hours. (a) What is the probability the component fails before 3000 hours? (b) Given it has already operated for 2000 hours, what is the probability it operates for at least 4000 additional hours? (c) Compare this to a non-memoryless distribution's behavior.

Problem 3 (Hypothesis Testing): In testing whether a sample variance differs from a population variance, we use the chi-square distribution. If a sample of size \(n = 25\) has variance \(s^2 = 36\) and we're testing against a hypothesized \(\sigma^2 = 25\), calculate the chi-square test statistic \(\chi^2 = \frac{(n-1)s^2}{\sigma^2}\) and find the probability of observing a value this extreme or more.

Problem 4 (Water Resources): Daily water consumption in a district follows a gamma distribution with \(\alpha = 3\) and \(\beta = 200\) thousand gallons. What capacity should be maintained to ensure demand is met 99% of days?

Thank You!

Contact Information:

Samir Orujov, PhD

Assistant Professor

School of Business

ADA University

Email: sorujov@ada.edu.az

Office: D312

Office Hours: By appointment

Next Class:

Topic: Sampling Distributions and the Central Limit Theorem

Reading: Chapter 8 (textbook sections on sampling distributions)

Preparation: Review properties of sums of random variables and convolution

Reminders:

  • Complete Practice Problems 1-4

  • Explore online gamma/chi-square calculators

  • Review integration by parts technique

  • Work hard!

Questions?

Open Discussion (5 minutes)

Key Topics for Discussion:

  • How do insurance companies use gamma distributions to set premiums that balance competitiveness with profitability and solvency requirements?

  • What are the limitations of the memoryless property assumption in financial modeling, and when might alternative distributions (Weibull, log-normal) be more appropriate?

  • How does the chi-square distribution connect to hypothesis testing for variance and goodness-of-fit tests in econometrics?

  • In reliability engineering, how do gamma distributions with different shape parameters model different failure mechanisms (wear-out vs. random failure)?