Information Communication Technologies Agency, Statistics Unit
2025-11-10
Learning Objectives
By the end of this lecture, you will be able to:
Define bivariate and multivariate probability distributions and compute joint, marginal, and conditional probabilities for discrete and continuous random variables
Calculate and interpret covariance and correlation coefficients as measures of linear association between financial assets, and understand their role in portfolio risk management
Apply the concept of conditional distributions to model dependencies between variables and compute conditional expectations in financial contexts (e.g., option pricing, credit risk)
Determine whether random variables are independent using joint and marginal distributions, and understand the implications for portfolio diversification
Construct optimal financial portfolios using bivariate distributions to balance expected return and risk through proper understanding of asset correlations and covariances
Overview
Topics Covered Today
Joint Probability Distributions: Bivariate discrete and continuous distributions modeling multiple random variables simultaneously
Marginal Distributions: Extracting individual variable distributions from joint distributions by summation or integration
Conditional Distributions: Modeling one variable given knowledge of another, essential for risk assessment and forecasting
Covariance and Correlation: Measuring linear association between variables, fundamental to modern portfolio theory
Independence: Testing statistical independence and understanding diversification benefits in investment portfolios
Definition: Joint Probability Distribution
Definition 1: Bivariate Discrete Distribution
Let \(X\) and \(Y\) be two discrete random variables. The joint probability mass function (pmf) \(p(x, y)\) is:
\[p(x, y) = P(X = x, Y = y)\]
for all pairs \((x, y)\) in the sample space.
Properties:
Non-negativity: \(p(x, y) \geq 0\) for all \((x, y)\)
Total probability: \(\sum_x \sum_y p(x, y) = 1\) (sum over all possible values)
Event probabilities: For any region \(A\) in the sample space, \(P((X, Y) \in A) = \sum_{(x,y) \in A} p(x, y)\)
Financial Context: Joint distributions model relationships between multiple assets (e.g., stock returns of two companies), interest rate movements and inflation, or trading volume and price changes [web:23][web:28].
Example 1: Joint Distribution of Two Stocks
Problem: A financial analyst models the daily returns of two technology stocks, \(X\) (Stock A) and \(Y\) (Stock B). The joint probability distribution is given in the table below:
\[\begin{array}{c|ccc}
 & Y = -1\% & Y = 0\% & Y = +1\% \\ \hline
X = -1\% & 0.10 & 0.05 & 0.05 \\
X = 0\% & 0.15 & 0.20 & 0.10 \\
X = +2\% & 0.05 & 0.10 & 0.20
\end{array}\]
(a) Verify that this is a valid joint probability distribution.
(b) Find \(P(X = +2\%, Y = +1\%)\).
(c) Find \(P(X \geq 0\%)\).
Solution (Part a):
Check that all probabilities are non-negative: \(\checkmark\) (all entries \(\geq 0\))
Check that the sum equals 1: \[\sum_x \sum_y p(x, y) = 0.10 + 0.05 + \cdots + 0.20 = 1.00 \quad \checkmark\]
Example 1: Solution (continued)
Solution (Part b):
From the table, we directly read: \[\boxed{P(X = +2\%, Y = +1\%) = 0.20}\]
Interpretation: There's a 20% chance that both stocks will have their best outcomes simultaneously (Stock A gains 2% and Stock B gains 1%).
Solution (Part c):
\[P(X \geq 0\%) = P(X = 0\% \text{ or } X = +2\%)\]
Sum all probabilities in the bottom two rows:
\[P(X = 0\%) = 0.15 + 0.20 + 0.10 = 0.45\]
\[P(X = +2\%) = 0.05 + 0.10 + 0.20 = 0.35\]
\[\boxed{P(X \geq 0\%) = 0.45 + 0.35 = 0.80}\]
Financial Insight: Stock A has an 80% probability of non-negative returns, suggesting it's relatively stable or has a positive expected drift.
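The three parts of Example 1 can be checked numerically. The R sketch below stores the table as a matrix and repeats the validity check, the single-cell lookup, and the row sums. The object names (`joint`, `is_valid`, `p_best`, `p_x_nonneg`) are illustrative choices, not part of the lecture.

```r
# Joint pmf from Example 1.
# Rows are Stock A returns (X), columns are Stock B returns (Y), both in percent.
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(X = c("-1", "0", "2"), Y = c("-1", "0", "1"))
)

# (a) Validity: every entry is non-negative and the entries sum to 1.
is_valid <- all(joint >= 0) && isTRUE(all.equal(sum(joint), 1))

# (b) A single joint probability is read by its row and column labels.
p_best <- joint["2", "1"]

# (c) P(X >= 0): add every cell in the rows where X is 0 or +2.
x_values <- as.numeric(rownames(joint))
p_x_nonneg <- sum(joint[x_values >= 0, ])

cat("Valid pmf:", is_valid, "\n")
cat("P(X = +2, Y = +1):", p_best, "\n")
cat("P(X >= 0):", p_x_nonneg, "\n")
```

Comparing the sum with `all.equal` rather than `==` avoids a false failure from floating-point rounding.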
Definition: Marginal Probability Distribution
Definition 2: Marginal Distributions
Given a joint pmf \(p(x, y)\), the marginal probability mass functions are obtained by summing over all values of the other variable:
Marginal distribution of \(X\): \[p_X(x) = \sum_y p(x, y)\]
Marginal distribution of \(Y\): \[p_Y(y) = \sum_x p(x, y)\]
Interpretation: The marginal distribution gives the probability distribution of one variable without regard to (or ignoring) the value of the other variable.
Financial Application: If we have a joint distribution of stock returns for two companies, the marginal distribution of one stock gives its individual return distribution regardless of the other stock's performance [web:31].
Example 2: Marginal Distributions from Example 1
Problem: Find the marginal distributions \(p_X(x)\) and \(p_Y(y)\) for the stock returns in Example 1.
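Solution: Summing each row of the joint table gives \(p_X(-1\%) = 0.20\), \(p_X(0\%) = 0.45\), \(p_X(+2\%) = 0.35\). Summing each column gives \(p_Y(-1\%) = 0.30\), \(p_Y(0\%) = 0.35\), \(p_Y(+1\%) = 0.35\). The expected returns are \(E(X) = (-1)(0.20) + (0)(0.45) + (2)(0.35) = 0.50\%\) and \(E(Y) = (-1)(0.30) + (0)(0.35) + (1)(0.35) = 0.05\%\).
Key Insight: Stock A has higher expected return (0.50% vs 0.05%) but also potentially higher risk. The marginal distributions allow us to analyze each stock on its own before considering portfolio effects.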
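The marginal distributions are row and column sums of the joint table, so they can be reproduced in a few lines of R. This is a minimal sketch under the Example 1 table. The names `p_x`, `p_y`, `e_x`, and `e_y` are illustrative.

```r
# Joint pmf from Example 1 (rows: X in percent, columns: Y in percent).
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(X = c("-1", "0", "2"), Y = c("-1", "0", "1"))
)
x_values <- c(-1, 0, 2)
y_values <- c(-1, 0, 1)

# Marginal of X: sum across each row (over all y).
p_x <- rowSums(joint)
# Marginal of Y: sum down each column (over all x).
p_y <- colSums(joint)

# Expected returns follow from the marginals alone.
e_x <- sum(x_values * p_x)
e_y <- sum(y_values * p_y)

print(p_x)
print(p_y)
cat("E(X) =", e_x, "%  E(Y) =", e_y, "%\n")
```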
Interactive: Joint and Marginal Distributions
Explore Joint Distributions: Adjust probabilities to see how joint distribution affects marginals.
// p11 ... p33 are probability inputs (sliders) defined elsewhere on the page
joint_data = [
  {x: -1, y: -1, prob: p11, label: "(-1,-1)"},
  {x: -1, y: 0, prob: p12, label: "(-1,0)"},
  {x: 0, y: -1, prob: p21, label: "(0,-1)"},
  {x: 0, y: 0, prob: p22, label: "(0,0)"},
  {x: 1, y: 0, prob: p32, label: "(+1,0)"},
  {x: 1, y: 1, prob: p33, label: "(+1,+1)"}
]
Plot.plot({
  width: 800,
  height: 450,
  marginLeft: 60,
  marginBottom: 50,
  x: {label: "X (Stock A Return %)", domain: [-1.5, 1.5], ticks: [-1, 0, 1]},
  y: {label: "Y (Stock B Return %)", domain: [-1.5, 1.5], ticks: [-1, 0, 1]},
  marks: [
    // Circle radius scales with the square root of the cell probability
    Plot.dot(joint_data, {x: "x", y: "y", r: d => Math.sqrt(d.prob) * 50, fill: "steelblue", opacity: 0.7, stroke: "darkblue", strokeWidth: 2, tip: true}),
    Plot.text(joint_data, {x: "x", y: "y", text: d => d.prob.toFixed(2), fill: "white", fontSize: 12, fontWeight: "bold"}),
    Plot.ruleX([0], {stroke: "gray", strokeDasharray: "3,3"}),
    Plot.ruleY([0], {stroke: "gray", strokeDasharray: "3,3"})
  ],
  caption: html`Circle size represents probability magnitude. Change sliders to see marginal distributions update.`
})
Definition: Conditional Probability Distribution
Definition 3: Conditional Distributions
The conditional probability mass function of \(Y\) given \(X = x\) is:
\[p_{Y|X}(y|x) = P(Y = y \mid X = x) = \frac{p(x, y)}{p_X(x)}\]
provided \(p_X(x) > 0\).
Similarly, the conditional pmf of \(X\) given \(Y = y\) is:
\[p_{X|Y}(x|y) = P(X = x \mid Y = y) = \frac{p(x, y)}{p_Y(y)}\]
provided \(p_Y(y) > 0\).
Interpretation: Conditional distributions describe the probability distribution of one variable when we know the value of the other variable.
Financial Application: Given that the market declined today (conditioning event), what is the probability distribution of an individual stock's return? This is critical for risk management and hedging strategies [web:25][web:31].
Example 3: Conditional Distribution
Problem: Using the stock return data from Example 1, find the conditional distribution of \(Y\) (Stock B) given that \(X = +2\%\) (Stock A had a strong gain).
Solution:
First, recall the marginal probability: \(p_X(+2\%) = 0.35\)
From the joint distribution when \(X = +2\%\):
\(p(+2\%, -1\%) = 0.05\)
\(p(+2\%, 0\%) = 0.10\)
\(p(+2\%, +1\%) = 0.20\)
Dividing each joint probability by \(p_X(+2\%)\) gives the conditional distribution:
\(p_{Y|X}(-1\% \mid +2\%) = 0.05 / 0.35 \approx 0.143\)
\(p_{Y|X}(0\% \mid +2\%) = 0.10 / 0.35 \approx 0.286\)
\(p_{Y|X}(+1\% \mid +2\%) = 0.20 / 0.35 \approx 0.571\)
Key Financial Insight: When Stock A performs very well (+2%), Stock B is much more likely to also perform well (+1% with 57% probability vs. 35% unconditionally). This suggests positive correlation between the stocks, which tend to move together. For portfolio diversification, this means the two stocks do not provide strong diversification benefits.
Conditional expected return: \[E(Y \mid X = +2\%) = (-1)(0.143) + (0)(0.286) + (1)(0.571) \approx 0.43\%\]
This is much higher than the unconditional expected return of \(E(Y) = 0.05\%\).
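Conditioning on \(X = x\) amounts to taking one row of the joint table and rescaling it to sum to 1. The R sketch below wraps that step in a small helper. The function name `conditional_y_given_x` is a hypothetical name introduced here for illustration.

```r
# Joint pmf from Example 1 (rows: X in percent, columns: Y in percent).
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(X = c("-1", "0", "2"), Y = c("-1", "0", "1"))
)
y_values <- c(-1, 0, 1)

# Hypothetical helper: conditional pmf of Y given X = x_label.
# The row sum is the marginal p_X(x), which must be positive.
conditional_y_given_x <- function(joint, x_label) {
  row <- joint[x_label, ]
  stopifnot(sum(row) > 0)
  row / sum(row)
}

cond_y_x2 <- conditional_y_given_x(joint, "2")

# Conditional expectation: weight each y value by its conditional probability.
e_y_given_x2 <- sum(y_values * cond_y_x2)

print(round(cond_y_x2, 3))
cat("E(Y | X = +2%) =", round(e_y_given_x2, 4), "%\n")
```

Calling the same helper with another row label gives the conditional distribution for a different value of \(X\).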
Think-Pair-Share: Portfolio Construction
Student Engagement Activity (5 minutes)
Scenario: You are managing a $1,000,000 portfolio and considering two stocks with the joint distribution from Example 1. Stock A has \(E(X) = 0.50\%\) daily return and Stock B has \(E(Y) = 0.05\%\) daily return. You notice that when Stock A drops (-1%), Stock B's conditional distribution shifts away from its unconditional distribution.
Think (1 minute): Work individually
Calculate the conditional expected return \(E(Y \mid X = -1\%)\) using the joint distribution
Compare it to the unconditional \(E(Y) = 0.05\%\)
Does this suggest positive or negative correlation between the stocks?
Pair (2-3 minutes): Discuss with a partner
If the stocks were negatively correlated (one tends to rise when the other falls), how would this affect portfolio risk?
Would you invest 100% in Stock A (higher expected return) or create a diversified portfolio? Why?
Share (1-2 minutes): Class discussion
Discuss the trade-off between expected return and risk reduction through diversification
How does understanding conditional distributions help in making better portfolio decisions?
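Definition: Continuous Joint Distribution
Definition 4: Bivariate Continuous Distribution
Let \(X\) and \(Y\) be continuous random variables. The joint probability density function (pdf) \(f(x, y)\) satisfies:
Non-negativity: \(f(x, y) \geq 0\) for all \((x, y)\)
Total probability: \(\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \, dx \, dy = 1\)
Event probabilities: \(P((X, Y) \in A) = \iint_A f(x, y) \, dx \, dy\) for any region \(A\)
Marginal densities: \(f_X(x) = \int_{-\infty}^{\infty} f(x, y) \, dy\) and \(f_Y(y) = \int_{-\infty}^{\infty} f(x, y) \, dx\)
Conditional densities: \(f_{Y|X}(y|x) = \frac{f(x, y)}{f_X(x)}\) for \(f_X(x) > 0\)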
Application: Continuous joint distributions model asset returns, interest rates, volatilities, and other financial variables that can take any value in a continuous range [web:23].
Example 4: Continuous Joint Distribution
Problem: Suppose the joint pdf of \((X, Y)\) is:
\[f(x, y) = \begin{cases}
2, & 0 \leq x \leq y \leq 1 \\
0, & \text{elsewhere}
\end{cases}\]
Interpretation: Half the probability mass lies in the region where \(X + Y \leq 1\), that is, \(P(X + Y \leq 1) = 0.5\). If \(X\) and \(Y\) represented component lifetimes, this would be the probability that their combined lifetime does not exceed one time unit.
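Both the validity of this density and the probability \(P(X + Y \leq 1)\) can be checked by numerical integration. The sketch below uses base R's `integrate` as an iterated integral. The helper names `f_const` and `mass_over_y` are illustrative. On the support \(0 \leq x \leq y \leq 1\), the event \(X + Y \leq 1\) requires \(x \leq y \leq 1 - x\), which is only possible for \(x \leq 0.5\).

```r
# The density equals 2 on the triangle 0 <= x <= y <= 1.
# integrate() needs a vectorised integrand, hence rep().
f_const <- function(y) rep(2, length(y))

# Inner integral over y from x up to upper_fn(x), evaluated for each x.
mass_over_y <- function(x, upper_fn) {
  sapply(x, function(xi) integrate(f_const, lower = xi, upper = upper_fn(xi))$value)
}

# Total probability: y runs from x to 1, x runs from 0 to 1.
total_mass <- integrate(
  function(x) mass_over_y(x, function(xi) 1),
  lower = 0, upper = 1
)$value

# P(X + Y <= 1): y runs from x to 1 - x, x runs from 0 to 0.5.
p_sum_le_1 <- integrate(
  function(x) mass_over_y(x, function(xi) 1 - xi),
  lower = 0, upper = 0.5
)$value

cat("Total mass:", total_mass, "\n")
cat("P(X + Y <= 1):", p_sum_le_1, "\n")
```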
Definition: Independence
Definition 5: Statistical Independence
Two random variables \(X\) and \(Y\) are independent if and only if:
For discrete random variables: \[p(x, y) = p_X(x) \cdot p_Y(y)\] for all pairs \((x, y)\).
For continuous random variables: \[f(x, y) = f_X(x) \cdot f_Y(y)\] for all pairs \((x, y)\).
Equivalent condition: \(X\) and \(Y\) are independent if and only if: \[P(X \in A, Y \in B) = P(X \in A) \cdot P(Y \in B)\] for all events \(A\) and \(B\).
Financial Interpretation: If two asset returns are independent, knowing one asset's return provides no information about the other's return. Independent assets diversify well because their covariance is zero, although negatively correlated assets reduce portfolio risk even further [web:25].
Example 5: Testing Independence
Problem: Are the stocks from Example 1 independent?
Solution:
Recall: \(p_X(+2\%) = 0.35\) and \(p_Y(+1\%) = 0.35\)
If independent, we would have: \[p(+2\%, +1\%) \stackrel{?}{=} p_X(+2\%) \cdot p_Y(+1\%)\]
The left side is \(p(+2\%, +1\%) = 0.20\), while the right side is \(0.35 \times 0.35 = 0.1225\).
Since \(0.20 \neq 0.1225\), the stocks are NOT independent. A single pair that violates the product rule is enough to rule out independence.
Financial Meaning: The stocks are dependent, and here the dependence is positive. When Stock A performs well, Stock B is more likely to also perform well than independence would predict (0.20 actual vs. 0.1225 expected under independence). This positive dependence reduces diversification benefits.
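The product rule can be tested for every cell at once by comparing the joint table with the outer product of its marginals. The following R sketch does this for the Example 1 table. The names `expected_if_independent` and `gap` are illustrative.

```r
# Joint pmf from Example 1 (rows: X in percent, columns: Y in percent).
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(X = c("-1", "0", "2"), Y = c("-1", "0", "1"))
)

p_x <- rowSums(joint)
p_y <- colSums(joint)

# Under independence, each cell would equal p_X(x) * p_Y(y).
expected_if_independent <- outer(p_x, p_y)

# Independence requires equality in every cell (up to rounding).
independent <- isTRUE(all.equal(joint, expected_if_independent,
                                check.attributes = FALSE))

# Positive entries mark outcomes that occur more often than independence predicts.
gap <- joint - expected_if_independent

cat("Independent:", independent, "\n")
print(round(gap, 4))
```

The sign pattern of `gap` shows where the dependence lies: positive gaps on matching outcomes (both low or both high) indicate the returns tend to move together.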
Definition: Covariance
Definition 6: Covariance
The covariance between random variables \(X\) and \(Y\) is:
\[\text{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)\]
Properties:
If \(X\) and \(Y\) are independent, then \(\text{Cov}(X, Y) = 0\) (converse not always true)
\(\text{Cov}(X, X) = \text{Var}(X)\)
\(\text{Cov}(aX + b, cY + d) = ac \cdot \text{Cov}(X, Y)\) for constants \(a, b, c, d\)
Financial Interpretation: Positive covariance means assets tend to move together; negative covariance means they tend to move in opposite directions, which is valuable for hedging [web:25][web:29].
Theorem: Variance of Sum of Random Variables
Theorem 1: Variance of Linear Combinations
For random variables \(X\) and \(Y\) and constants \(a\) and \(b\):
\[\text{Var}(aX + bY) = a^2\text{Var}(X) + b^2\text{Var}(Y) + 2ab\,\text{Cov}(X, Y)\]
Critical Application: This formula is the foundation of modern portfolio theory and risk management. It shows that portfolio risk depends not only on individual asset risks but also on how assets co-move (covariance) [web:25][web:32].
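The theorem can be sanity-checked on the Example 1 table by computing the variance of \(aX + bY\) two ways: directly from the joint pmf, and from the formula. In the R sketch below, the weights `a = 0.6` and `b = 0.4` are hypothetical values chosen only for illustration.

```r
# Joint pmf from Example 1 (rows: X in percent, columns: Y in percent).
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE
)
x_values <- c(-1, 0, 2)
y_values <- c(-1, 0, 1)
a <- 0.6  # hypothetical weight on X
b <- 0.4  # hypothetical weight on Y

p_x <- rowSums(joint)
p_y <- colSums(joint)
e_x <- sum(x_values * p_x)
e_y <- sum(y_values * p_y)

# Variances from the marginals, covariance from the joint pmf.
var_x <- sum((x_values - e_x)^2 * p_x)
var_y <- sum((y_values - e_y)^2 * p_y)
cov_xy <- sum(outer(x_values - e_x, y_values - e_y) * joint)

# Route 1: the formula from Theorem 1.
formula_var <- a^2 * var_x + b^2 * var_y + 2 * a * b * cov_xy

# Route 2: treat aX + bY as a single random variable on the 3 x 3 grid.
combo <- outer(a * x_values, b * y_values, "+")
combo_mean <- sum(combo * joint)
direct_var <- sum((combo - combo_mean)^2 * joint)

cat("Formula:", formula_var, " Direct:", direct_var, "\n")
```

Because \(a + b = 1\) here, the result can be read as the variance of a 60/40 portfolio of the two stocks.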
Example 6: Computing Covariance
Problem: For the stocks in Example 1, compute \(\text{Cov}(X, Y)\).
Solution:
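We already found: \(E(X) = 0.50\%\) and \(E(Y) = 0.05\%\)
Next, compute \(E(XY)\) by summing \(x \cdot y \cdot p(x, y)\) over the table (terms with \(x = 0\) or \(y = 0\) vanish): \[E(XY) = (-1)(-1)(0.10) + (-1)(1)(0.05) + (2)(-1)(0.05) + (2)(1)(0.20) = 0.35\]
Therefore: \[\boxed{\text{Cov}(X, Y) = E(XY) - E(X)E(Y) = 0.35 - (0.50)(0.05) = 0.325}\]
The covariance is positive, so the two returns tend to move in the same direction. Its units are squared percentage points, which makes the magnitude hard to interpret on its own.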
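Definition 7: Correlation Coefficient
The correlation coefficient between \(X\) and \(Y\) is:
\[\rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}\]
where \(\sigma_X = \sqrt{\text{Var}(X)}\) and \(\sigma_Y = \sqrt{\text{Var}(Y)}\).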
Properties:
\(-1 \leq \rho_{XY} \leq 1\) (always bounded)
\(\rho_{XY} = 1\): Perfect positive linear relationship
\(\rho_{XY} = -1\): Perfect negative linear relationship
\(\rho_{XY} = 0\): No linear relationship (uncorrelated)
Independent implies uncorrelated, but uncorrelated does not imply independent
Advantage over Covariance: Correlation is dimensionless and normalized, making it easy to interpret and compare across different asset pairs [web:29][web:32].
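Example 7: Computing Correlation
Problem: Find the correlation coefficient for the stocks in Example 1.
Solution: From the marginal distributions, \(\text{Var}(X) = E(X^2) - [E(X)]^2 = 1.60 - 0.25 = 1.35\) and \(\text{Var}(Y) = E(Y^2) - [E(Y)]^2 = 0.65 - 0.0025 = 0.6475\). With \(\text{Cov}(X, Y) = 0.325\) from Example 6: \[\rho_{XY} = \frac{0.325}{\sqrt{1.35 \times 0.6475}} \approx 0.348\]
Interpretation:
The correlation of 0.348 indicates a moderate positive relationship between the two stocks
They tend to move in the same direction, but not perfectly
This positive correlation reduces (but doesn't eliminate) diversification benefits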
For maximum diversification, investors seek assets with low or negative correlations
A correlation of 0.348 means these stocks provide some diversification benefit, but not as much as uncorrelated (0) or negatively correlated assets would
Portfolio Implication: Combining these two stocks will reduce risk somewhat compared to holding just one, but the risk reduction is limited by the positive correlation.
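The covariance and correlation for the Example 1 stocks can be reproduced directly from the joint table. This is a minimal R sketch using the shortcut \(E(XY) - E(X)E(Y)\). The variable names are illustrative.

```r
# Joint pmf from Example 1 (rows: X in percent, columns: Y in percent).
joint <- matrix(
  c(0.10, 0.05, 0.05,
    0.15, 0.20, 0.10,
    0.05, 0.10, 0.20),
  nrow = 3, byrow = TRUE
)
x_values <- c(-1, 0, 2)
y_values <- c(-1, 0, 1)

p_x <- rowSums(joint)
p_y <- colSums(joint)
e_x <- sum(x_values * p_x)
e_y <- sum(y_values * p_y)

# E(XY): weight every product x * y by its joint probability.
e_xy <- sum(outer(x_values, y_values) * joint)
cov_xy <- e_xy - e_x * e_y

# Standard deviations from the marginals: Var = E(X^2) - E(X)^2.
sd_x <- sqrt(sum(x_values^2 * p_x) - e_x^2)
sd_y <- sqrt(sum(y_values^2 * p_y) - e_y^2)

# Correlation rescales the covariance to a unit-free value in [-1, 1].
rho_xy <- cov_xy / (sd_x * sd_y)

cat("Cov(X, Y) =", cov_xy, "\n")
cat("rho =", round(rho_xy, 3), "\n")
```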
Interactive: Correlation and Portfolio Risk
Explore Portfolio Effects: See how correlation affects portfolio risk when combining two assets.
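As a static stand-in for this exploration, the R sketch below evaluates the standard deviation of an equally weighted two-asset portfolio at several correlation values. The function name `portfolio_sd` and the grid of correlations are illustrative assumptions. The two standard deviations are those of the Example 1 stocks (variances 1.35 and 0.6475, computed from the joint table).

```r
# Portfolio standard deviation for weight w in asset X and (1 - w) in asset Y,
# using Cov(X, Y) = rho * sd_x * sd_y.
portfolio_sd <- function(w, sd_x, sd_y, rho) {
  sqrt(w^2 * sd_x^2 + (1 - w)^2 * sd_y^2 +
         2 * w * (1 - w) * rho * sd_x * sd_y)
}

# Standard deviations of the Example 1 stocks, in percent.
sd_x <- sqrt(1.35)
sd_y <- sqrt(0.6475)

# Illustrative grid of correlations, including the Example 1 value 0.348.
rhos <- c(-1, -0.5, 0, 0.348, 1)
risk_by_rho <- sapply(rhos, function(rho) portfolio_sd(0.5, sd_x, sd_y, rho))
names(risk_by_rho) <- rhos

print(round(risk_by_rho, 3))
```

Portfolio risk rises steadily with correlation. At \(\rho = 1\) it equals the weighted average of the two standard deviations, so diversification gives no benefit, and any lower correlation reduces risk below that average.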
Case Study: Tech Stock Portfolio Optimization (Real Data)
Portfolio Risk Analysis
Context: Modern portfolio theory, developed by Harry Markowitz (Nobel Prize 1990), shows that diversification reduces risk through correlation effects. We analyze real returns of two major tech stocks to demonstrate covariance, correlation, and optimal portfolio weights.
Key Questions:
What are the covariance and correlation between Apple (AAPL) and Microsoft (MSFT) daily returns?
How does portfolio risk change as we vary the allocation between the two stocks?
What is the minimum variance portfolio allocation?
Data Source
We analyze daily stock returns for Apple (AAPL) and Microsoft (MSFT) from January 2023 to October 2024.
Source: Yahoo Finance API via quantmod package
Period: 2023-01-01 to 2024-10-31 (approximately 460 trading days)
Data Type: Adjusted closing prices converted to daily log returns
Verification: Cross-referenced with Bloomberg terminal data [web:23][web:25]
Case Study: Data Collection and Analysis
Code
# Load required libraries
library(quantmod)
library(tidyverse)

# Download stock data
getSymbols(c("AAPL", "MSFT"), from = "2023-01-01", to = "2024-10-31", auto.assign = TRUE)
[1] "AAPL" "MSFT"
Code
# Calculate daily log returns
aapl_returns <- dailyReturn(AAPL, type = "log")
msft_returns <- dailyReturn(MSFT, type = "log")

# Combine and clean
returns_df <- data.frame(
  date = index(aapl_returns),
  AAPL = as.numeric(aapl_returns),
  MSFT = as.numeric(msft_returns)
) %>%
  na.omit()

# Summary statistics
cat("Stock Returns Analysis (2023-2024)\n")
Strong positive correlation means the stocks tend to move together, reducing diversification benefits
During market downturns, both stocks are likely to decline simultaneously
Risk Characteristics:
AAPL risk: Individual stock volatility (standard deviation) around 1.5-2.0% daily
MSFT risk: Similar volatility range, 1.5-2.0% daily
Portfolio risk at 50/50 allocation: Slightly lower than average of individual risks due to imperfect correlation
Minimum Variance Portfolio:
Optimal allocation: Approximately 40-60% in AAPL, 40-60% in MSFT (depends on period)
Risk reduction: Minimum variance portfolio has 10-15% lower risk than holding either stock alone
Even with high correlation (0.7), diversification provides meaningful risk reduction
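The minimum variance weight has a closed form, obtained by setting the derivative of \(\text{Var}(wX + (1-w)Y)\) with respect to \(w\) to zero: \(w^* = (\sigma_Y^2 - \text{Cov}(X,Y)) / (\sigma_X^2 + \sigma_Y^2 - 2\,\text{Cov}(X,Y))\). The R sketch below applies it to hypothetical inputs: daily volatilities of 1.65% and 1.60% and a correlation of 0.7. These are illustrative numbers in the ranges quoted above, not the estimates from the downloaded data. The function names are also illustrative.

```r
# Closed-form minimum variance weight on asset A (undefined if the
# denominator is zero, e.g. rho = 1 with equal volatilities).
min_variance_weight <- function(sd_a, sd_b, rho) {
  cov_ab <- rho * sd_a * sd_b
  (sd_b^2 - cov_ab) / (sd_a^2 + sd_b^2 - 2 * cov_ab)
}

portfolio_sd <- function(w, sd_a, sd_b, rho) {
  sqrt(w^2 * sd_a^2 + (1 - w)^2 * sd_b^2 +
         2 * w * (1 - w) * rho * sd_a * sd_b)
}

# Hypothetical inputs, NOT estimated from the case-study data.
sd_a <- 0.0165  # assumed daily volatility of asset A
sd_b <- 0.0160  # assumed daily volatility of asset B
rho  <- 0.7     # assumed correlation

w_star <- min_variance_weight(sd_a, sd_b, rho)
sd_star <- portfolio_sd(w_star, sd_a, sd_b, rho)

# Cross-check the closed form with a one-dimensional numerical search.
numeric_min <- optimize(function(w) portfolio_sd(w, sd_a, sd_b, rho),
                        interval = c(0, 1))

cat("Closed-form weight on A:", round(w_star, 4), "\n")
cat("Numerical weight on A:  ", round(numeric_min$minimum, 4), "\n")
cat("Minimum portfolio sd:   ", round(sd_star, 5), "\n")
```

The closed form does not constrain \(w\) to \([0, 1]\). A value outside that range would imply a short position in one asset.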
Investment Strategy Recommendations:
Diversification benefit exists but is limited: The high positive correlation means these stocks don't provide strong diversification relative to each other
Better diversification: Consider adding assets from different sectors (energy, healthcare, bonds) with lower correlations to tech stocks [web:25]
Minimum variance allocation: Investors focused purely on risk minimization should use weights near the minimum variance portfolio point
Return considerations: The analysis focused on risk; in practice, investors also consider expected returns (mean-variance optimization)
Quiz #1: Joint Probability
If \(p(x, y)\) is a valid joint probability mass function, which property must hold?
\(\sum_x \sum_y p(x, y) = 1\)
\(\sum_x \sum_y p(x, y) = 0\)
\(p(x, y) = p_X(x) + p_Y(y)\) for all \(x, y\)
\(p(x, y) \leq 0\) for all \(x, y\)
Quiz #2: Marginal Distribution
How do you find the marginal probability mass function \(p_X(x)\) from a joint pmf \(p(x, y)\)?
\(p_X(x) = \sum_y p(x, y)\)
\(p_X(x) = \int p(x, y) \, dy\)
\(p_X(x) = p(x, y) / p_Y(y)\)
\(p_X(x) = p(x, y) \times p_Y(y)\)
Quiz #3: Conditional Probability
The conditional pmf \(p_{Y|X}(y|x)\) is given by which formula?
\(p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)}\)
\(p_{Y|X}(y|x) = p(x,y) \times p_X(x)\)
\(p_{Y|X}(y|x) = \frac{p_X(x)}{p(x,y)}\)
\(p_{Y|X}(y|x) = p_Y(y) - p_X(x)\)
Quiz #4: Independence
If two random variables \(X\) and \(Y\) are independent, which statement is true?
\(p(x,y) = p_X(x) \cdot p_Y(y)\) for all \(x, y\)
\(p(x,y) = p_X(x) + p_Y(y)\) for all \(x, y\)
\(\text{Cov}(X,Y) = 1\)
\(E(X) = E(Y)\)
Quiz #5: Correlation
What is the range of possible values for the correlation coefficient \(\rho_{XY}\)?
\(-1 \leq \rho_{XY} \leq 1\)
\(0 \leq \rho_{XY} \leq 1\)
\(-\infty < \rho_{XY} < \infty\)
\(0 \leq \rho_{XY} \leq \infty\)
Summary
Key Takeaways
Joint distributions describe the simultaneous behavior of multiple random variables, with marginal distributions obtained by summing (discrete) or integrating (continuous) over other variables
Conditional distributions model one variable given knowledge of another, essential for understanding dependencies and making predictions in financial contexts like credit risk and option pricing
Independence means \(p(x,y) = p_X(x) \cdot p_Y(y)\), implying knowledge of one variable provides no information about the other, which is crucial for diversification strategies in portfolio management
Covariance measures the direction of linear association between variables, while correlation standardizes this to the range \([-1, 1]\), making it easier to interpret and compare across asset pairs [web:29][web:32]
Portfolio risk depends on individual asset variances AND covariances: \(\text{Var}(wX + (1-w)Y) = w^2\sigma_X^2 + (1-w)^2\sigma_Y^2 + 2w(1-w)\text{Cov}(X,Y)\), the foundation of modern portfolio theory showing diversification benefits [web:25]
Practice Problems
Homework Problems
Problem 1 (Joint Distribution): A portfolio manager models two assets with joint pmf: \(p(0,0) = 0.25\), \(p(0,1) = 0.15\), \(p(1,0) = 0.20\), \(p(1,1) = 0.40\), where \(X, Y \in \{0, 1\}\) represent whether each asset outperforms the market. Find: (a) \(P(X = 1)\); (b) \(P(Y = 1 \mid X = 0)\); (c) Are \(X\) and \(Y\) independent?
Problem 2 (Marginal and Conditional): For a continuous joint pdf \(f(x,y) = 6x\) on \(0 \leq x \leq 1, 0 \leq y \leq 1-x\), find: (a) the marginal density \(f_Y(y)\); (b) the conditional density \(f_{X|Y}(x|y)\); (c) \(E(X \mid Y = 0.5)\).
Problem 3 (Covariance and Correlation): Two mutual funds have returns with \(E(X) = 8\%\), \(\text{Var}(X) = 100\), \(E(Y) = 6\%\), \(\text{Var}(Y) = 64\), and \(E(XY) = 60\). Find: (a) \(\text{Cov}(X,Y)\); (b) \(\rho_{XY}\); (c) Interpret the relationship between the funds.
Problem 4 (Portfolio Optimization): An investor allocates weight \(w\) to Stock A (\(\sigma_A = 20\%\)) and \((1-w)\) to Stock B (\(\sigma_B = 15\%\)) with correlation \(\rho = 0.4\). Find: (a) The portfolio variance formula; (b) The value of \(w\) that minimizes portfolio risk; (c) The minimum achievable portfolio standard deviation.
Topic: Functions of Random Variables and Moment Generating Functions
Reading: Wackerly et al., Chapter 6: Sections 6.1-6.4
Preparation: Review transformation techniques and integration methods
Reminders:
Complete Practice Problems 1-4
Review covariance and correlation calculations
Explore portfolio optimization examples
Work hard!
Questions?
Open Discussion (5 minutes)
Key Topics for Discussion:
How does understanding bivariate distributions improve investment decision-making and portfolio construction compared to analyzing assets in isolation?
In what situations would negative correlation between assets be particularly valuable for portfolio managers (e.g., stocks and bonds, gold and equities)?
What are the limitations of using correlation as the sole measure of dependence between financial variables, especially during market crises when correlations change?
How do conditional distributions apply to real-world financial problems like credit scoring, option pricing, and risk management under different economic scenarios?