The notes for this lecture are derived from Chapter 5 of the Bayes Rules! book.
The prior distribution reflects our current information about the parameter. When choosing a prior, we may consider how well it captures that information and how tractable it makes the resulting posterior.
Let the prior model for parameter \(\theta\) have pdf \(f(\theta)\), and let the model of data \(Y\) conditioned on \(\theta\) have likelihood function \(L(\theta|y)\). If the resulting posterior model, with pdf \(f(\theta|y) \propto f(\theta)L(\theta|y)\), is of the same model family as the prior, then we say that \(f(\theta)\) is a conjugate prior.
Examples
The Beta-Binomial Model - the Beta is a conjugate prior for the Binomial likelihood.
The Gamma-Poisson Model
The Normal-Normal Model
\[f(\pi)=e-e^\pi\; \text{ for } \pi \in [0,1] \] Is this a valid pdf?
\(f(\pi)\) is non-negative on the support of \(\pi\), since \(e^\pi \le e\) for \(\pi \in [0,1]\).
\(\int_0^1 f(\pi)\, d\pi \stackrel{?}{=} 1\)
\(\int_0^1 (e-e^\pi)\, d\pi=(e\pi - e^\pi)\big|_0^1 = [e-e] -[0-e^0]= 1\)
\(\int_0^1 f(\pi)\, d\pi = 1\), so \(f(\pi)\) is a valid pdf.
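Both conditions can also be verified numerically. A Python sketch (SciPy assumed; not part of the original notes):

```python
import numpy as np
from scipy import integrate

# Candidate pdf f(pi) = e - e^pi on [0, 1]
f = lambda p: np.e - np.exp(p)

# Condition 1: non-negative on the support
grid = np.linspace(0, 1, 1001)
assert np.all(f(grid) >= 0)

# Condition 2: integrates to 1 over [0, 1]
total, _ = integrate.quad(f, 0, 1)
print(round(total, 6))  # 1.0
```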
Assume we observe \(Y = 10\) successes in \(n = 50\) trials.
\(L(\pi | (y=10)) = {50 \choose 10} \pi^{10} (1-\pi)^{40} \; \; \text{ for } \pi \in [0,1] \; .\)
\(f(\pi | (y = 10)) \propto f(\pi) L(\pi | (y = 10)) = (e-e^\pi) \cdot \binom{50}{10} \pi^{10} (1-\pi)^{40}.\)
\(f(\pi | y ) \propto (e-e^\pi) \pi^{10} (1-\pi)^{40}.\)
\(f(\pi|y=10)= \frac{(e-e^\pi) \pi^{10} (1-\pi)^{40}}{\int_0^1(e-e^\pi) \pi^{10} (1-\pi)^{40}d\pi} \; \; \text{ for } \pi \in [0,1].\)
We would need to integrate to calculate this posterior model, and integrate again for its mean, mode, and variance.
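To see that burden concretely, here is a Python sketch (SciPy assumed; not from the book, which uses R) that normalizes this non-conjugate posterior and computes its mean by numerical integration:

```python
import numpy as np
from scipy import integrate

# Unnormalized posterior kernel: prior (e - e^pi) times the binomial likelihood kernel
kernel = lambda p: (np.e - np.exp(p)) * p**10 * (1 - p)**40

# One integral for the normalizing constant ...
const, _ = integrate.quad(kernel, 0, 1)

# ... and another for the posterior mean
post_mean, _ = integrate.quad(lambda p: p * kernel(p), 0, 1)
post_mean /= const
print(round(post_mean, 3))
```

With a conjugate prior, by contrast, the posterior's family and moments are available in closed form with no integration.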
We are interested in modeling \(\lambda\), the rate of fraud risk calls received per day. We plan on collecting data on the number of fraud risk phone calls received each day.
Let random variable \(Y\) be the number of independent events that occur in a fixed amount of time or space, where \(\lambda > 0\) is the rate at which these events occur. Then the dependence of \(Y\) on parameter \(\lambda\) can be modeled by the Poisson. In mathematical notation:
\[Y | \lambda \sim \text{Pois}(\lambda) \]
The Poisson model is specified by a conditional pmf:
\[\begin{equation} f(y|\lambda) = \frac{\lambda^y e^{-\lambda}}{y!}\;\; \text{ for } y \in \{0,1,2,\ldots\} \end{equation}\]
A Poisson random variable \(Y\) has equal mean and variance,
\[\begin{equation} E(Y|\lambda) = \text{Var}(Y|\lambda) = \lambda \; \end{equation}\]
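For instance, with an illustrative rate of \(\lambda = 5\) (a SciPy sketch, not part of the original notes), the pmf and the equal-mean-and-variance property can be checked directly:

```python
from scipy import stats

lam = 5  # illustrative rate, not estimated from data
Y = stats.poisson(lam)

# pmf at y = 3: 5^3 e^{-5} / 3!
print(round(Y.pmf(3), 4))

# Mean and variance are both lambda
print(Y.mean(), Y.var())  # 5.0 5.0
```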
Let \((Y_1,Y_2,\ldots,Y_n)\) be an independent sample of random variables and \(\vec{y} = (y_1,y_2,\ldots,y_n)\) be the corresponding vector of observed values.
\[\begin{equation} L(\lambda | \vec{y}) = \prod_{i=1}^n L(\lambda | y_i) = L(\lambda | y_1) \cdot L(\lambda | y_2) \cdots L(\lambda | y_n) \; \end{equation}\]
\[\begin{equation} L(\lambda | \vec{y}) = \prod_{i=1}^{n}f(y_i | \lambda) = \prod_{i=1}^{n}\frac{\lambda^{y_i}e^{-\lambda}}{y_i!} \;\; \text{ for } \; \lambda > 0 \; \end{equation}\]
\[\begin{split} L(\lambda | \vec{y}) & = \prod_{i=1}^{n}\frac{\lambda^{y_i}e^{-\lambda}}{y_i!} \\ & = \frac{\lambda^{y_1}e^{-\lambda}}{y_1!} \cdot \frac{\lambda^{y_2}e^{-\lambda}}{y_2!} \cdots \frac{\lambda^{y_n}e^{-\lambda}}{y_n!} \\ & =\frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i!} \\ \end{split}\]
We collect four days of data and receive 6, 2, 2, and 1 fraud risk calls on each day. Write out the likelihood model.
\[L(\lambda | \vec{y}) =\frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i!}\]
\[L(\lambda | \vec{y}) = \frac{\lambda^{6 +2+2+1}e^{-4\lambda}}{6!\times2!\times2!\times1!} \propto \lambda^{11}e^{-4\lambda} \; .\]
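This likelihood kernel can be evaluated on a grid to locate the value of \(\lambda\) it favors most, which for a Poisson sample is the mean \(\bar{y} = 11/4 = 2.75\). A Python sketch (NumPy assumed):

```python
import numpy as np

y = np.array([6, 2, 2, 1])  # observed daily call counts
n, s = len(y), y.sum()      # n = 4, sum of counts = 11

# Log of the likelihood kernel lambda^11 * exp(-4 lambda)
loglik = lambda lam: s * np.log(lam) - n * lam

grid = np.linspace(0.01, 10, 10000)
lam_hat = grid[np.argmax(loglik(grid))]
print(round(lam_hat, 2))  # ~2.75
```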
Let \(\lambda\) be a random variable which can take any value between 0 and \(\infty\), i.e. \(\lambda \in (0,\infty)\). Then the variability in \(\lambda\) might be well modeled by a Gamma model with shape parameter \(s > 0\) and rate parameter \(r > 0\):
\[\lambda \sim \text{Gamma}(s, r)\]
The Gamma model is specified by the continuous pdf
\[\begin{equation} f(\lambda) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda} \;\; \text{ for } \lambda > 0 \end{equation}\]
where constant \(\Gamma(s) = \int_0^\infty z^{s - 1} e^{-z}dz\). When \(s\) is a positive integer, \(s \in \{1,2,3,\ldots\}\), this constant simplifies to \(\Gamma(s) = (s - 1)!\).
The Exponential model \[\lambda \sim \text{Exp}(r)\] is a special case of the Gamma with shape parameter \(s = 1\): \(\text{Exp}(r) = \text{Gamma}(1,r)\).
What is \(f(\lambda)\) if \(\lambda =1\) and \(\lambda \sim \text{Gamma}(2, 5)\) ?
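Plugging into the pdf above gives \(f(1) = \frac{5^2}{\Gamma(2)} \cdot 1^{2-1} \cdot e^{-5} = 25e^{-5} \approx 0.168\). A Python sketch confirming this (note that SciPy parameterizes the Gamma by a scale, the reciprocal of the rate):

```python
import math
from scipy import stats

# f(1) for Gamma(s = 2, r = 5): (r^s / Gamma(s)) * lambda^{s-1} * e^{-r lambda} at lambda = 1
by_hand = (5**2 / math.gamma(2)) * 1**(2 - 1) * math.exp(-5)

# SciPy uses a scale parameter, scale = 1 / rate
by_scipy = stats.gamma.pdf(1, a=2, scale=1/5)

print(round(by_hand, 4), round(by_scipy, 4))  # 0.1684 0.1684
```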
Before we collected any data, our guess was that the rate of fraud risk calls was most likely around 5 calls per day, but could also reasonably range between 2 and 7 calls per day.
In other words, \(E(\lambda) = 5\) and \(\lambda\) most likely lies between 2 and 7. You can use the plot_gamma() function from the bayesrules package to try out different Gamma distributions.
\[E(\lambda) = \frac{s}{r} \approx 5 \; .\]
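For example, \(\text{Gamma}(10, 2)\) (the prior used in the summary below) has mean \(10/2 = 5\), and we can check how much of its mass falls in the 2-to-7 range. A SciPy sketch:

```python
from scipy import stats

prior = stats.gamma(a=10, scale=1/2)  # Gamma(shape = 10, rate = 2); SciPy scale = 1/rate

print(prior.mean())                   # 5.0, matching E(lambda) = s/r
mass = prior.cdf(7) - prior.cdf(2)    # probability that lambda lies in (2, 7)
print(round(mass, 2))
```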
\[f(\lambda|\vec{y}) \propto f(\lambda)L(\lambda|\vec{y}) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda} \cdot \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod y_i!} \;\;\; \text{ for } \lambda > 0.\]
\[\begin{split} f(\lambda|\vec{y}) & \propto \lambda^{s-1} e^{-r\lambda} \cdot \lambda^{\sum y_i}e^{-n\lambda} \\ & = \lambda^{s + \sum y_i - 1} e^{-(r+n)\lambda} \\ \end{split}\]
\[ \lambda|\vec{y} \; \sim \; \text{Gamma}\bigg(s + \sum y_i, r + n \bigg) \; .\]
Let \(\lambda > 0\) be an unknown rate parameter and \((Y_1,Y_2,\ldots,Y_n)\) be an independent \(\text{Pois}(\lambda)\) sample. The Gamma-Poisson Bayesian model complements the Poisson structure of data \(Y\) with a Gamma prior on \(\lambda\):
\[\begin{split} Y_i | \lambda & \stackrel{ind}{\sim} \text{Pois}(\lambda) \\ \lambda & \sim \text{Gamma}(s, r) \\ \end{split}\]
Upon observing data \(\vec{y} = (y_1,y_2,\ldots,y_n)\), the posterior model of \(\lambda\) is also a Gamma with updated parameters:
\[\begin{equation} \lambda|\vec{y} \; \sim \; \text{Gamma}(s + \sum y_i, \; r + n) \; . \end{equation}\]
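With the \(\text{Gamma}(10, 2)\) prior and the four days of data above, the conjugate update is simple arithmetic: no integration needed. A Python sketch (SciPy assumed; the book's bayesrules package does this in R):

```python
from scipy import stats

s, r = 10, 2            # prior Gamma(shape, rate)
y = [6, 2, 2, 1]        # observed daily fraud risk call counts

# Conjugate update: Gamma(s + sum(y), r + n)
s_post, r_post = s + sum(y), r + len(y)
post = stats.gamma(a=s_post, scale=1/r_post)

print(s_post, r_post)                       # 21 6
print(post.mean(), (s_post - 1) / r_post)   # posterior mean 3.5, mode 3.333...
```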
Summaries of the prior and posterior Gamma models:

model      shape  rate  mean  mode      var        sd
prior      10     2     5.0   4.500000  2.5000000  1.5811388
posterior  21     6     3.5   3.333333  0.5833333  0.7637626