class: title-slide <br> <br> .right-panel[ # The Gamma-Poisson Models ## Dr. Mine Dogucu Examples from [bayesrulesbook.com](https://bayesrulesbook.com) ] --- class: middle ## Choosing a Prior Prior distribution depends on our current information. When choosing a prior we may consider - computational ease - interpretability --- ## Conjugate Prior Let the prior model for parameter `\(\theta\)` have pdf `\(f(\theta)\)` and the model of data `\(Y\)` conditioned on `\(\theta\)` have likelihood function `\(L(\theta|y)\)`. If the resulting posterior model with pdf `\(f(\theta|y) \propto f(\theta)L(\theta|y)\)` is of the same model family as the prior, then we say this is a __conjugate prior__. -- __Examples__ The Beta-Binomial Model - Beta is a conjugate prior for the Binomial Likelihood. -- The Gamma-Poisson Model -- The Normal-Normal Model --- ## Non-Conjugate Prior for Binomial Likelihood $$f(\pi)=e-e^\pi\; \text{ for } \pi \in [0,1] $$ -- - Is this a valid pdf? `\(f(\pi)\)` is non-negative on the support of `\(\pi\)`.
<img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> --- ## Non-Conjugate Prior for Binomial Likelihood $$f(\pi)=e-e^\pi\; \text{ for } \pi \in [0,1] $$ -- - Is this a valid pdf? `\(\int_0^1 f(\pi) \stackrel{?}{=} 1\)` -- `\(\int_0^1 e-e^\pi\ d\pi=(e\pi - e^\pi)|_0^1 = [e-0] -[e-1]= 1\)` -- `\(\int_0^1 f(\pi) = 1\)`
--- ## Getting to the posterior model Assume `\(Y = 10\)` and `\(n = 50\)` `\(L(\pi | (y=10)) = {50 \choose 10} \pi^{10} (1-\pi)^{40} \; \; \text{ for } \pi \in [0,1] \; .\)` -- `\(f(\pi | (y = 10)) \propto f(\pi) L(\pi | (y = 10)) = (e-e^\pi) \cdot \binom{50}{10} \pi^{10} (1-\pi)^{40}.\)` -- `\(f(\pi | y ) \propto (e-e^\pi) \pi^{10} (1-\pi)^{40}.\)` -- `\(f(\pi|y=10)= \frac{(e-e^\pi) \pi^{10} (1-\pi)^{40}}{\int_0^1(e-e^\pi) \pi^{10} (1-\pi)^{40}d\pi} \; \; \text{ for } \pi \in [0,1].\)` We would need to integrate to calculate this posterior model, and integrate again for its mean, mode, and variance. --- ## The Question We are interested in modeling `\(\lambda\)` the rate of fraud risk calls received per day. We plan on collecting data on the number of fraud risk phone calls received each day. --- __The Poisson model__ Let random variable `\(Y\)` be the _number of independent events_ that occur in a fixed amount of time or space, where `\(\lambda > 0\)` is the rate at which these events occur. Then the _dependence_ of `\(Y\)` on __parameter__ `\(\lambda\)` can be modeled by the Poisson. In mathematical notation: $$Y | \lambda \sim \text{Pois}(\lambda) $$ -- The Poisson model is specified by a conditional pmf: `\begin{equation} f(y|\lambda) = \frac{\lambda^y e^{-\lambda}}{y!}\;\; \text{ for } y \in \{0,1,2,\ldots\} \end{equation}` -- A Poisson random variable `\(Y\)` is assumed to have equal mean and variance, `\begin{equation} E(Y|\lambda) = \text{Var}(Y|\lambda) = \lambda \; \end{equation}` --- __Joint likelihood function__ Let `\((Y_1,Y_2,\ldots,Y_n)\)` be an independent sample of random variables and `\(\vec{y} = (y_1,y_2,\ldots,y_n)\)` be the corresponding vector of observed values. `\begin{equation} L(\lambda | \vec{y}) = \prod_{i=1}^n L(\lambda | y_i) = L(\lambda | y_1) \cdot L(\lambda | y_2) \cdots L(\lambda | y_n) \; \end{equation}` -- `\begin{equation} L(\lambda | \vec{y}) = \prod_{i=1}^{n}f(y_i | \lambda) = \prod_{i=1}^{n}\frac{\lambda^{y_i}e^{-\lambda}}{y_i!} \;\; \text{ for } \; \lambda > 0 \; \end{equation}` --- __Joint likelihood function__ `$$\begin{split} L(\lambda | \vec{y}) & = \prod_{i=1}^{n}\frac{\lambda^{y_i}e^{-\lambda}}{y_i!} \\ & = \frac{\lambda^{y_1}e^{-\lambda}}{y_1!} \cdot \frac{\lambda^{y_2}e^{-\lambda}}{y_2!} \cdots \frac{\lambda^{y_n}e^{-\lambda}}{y_n!} \\ & =\frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i!} \\ \end{split}$$` --- ## Poisson Likelihood We collect four days of data and receive 6, 2, 2, 1 spam calls each day. Write out the likelihood model. `$$L(\lambda | \vec{y}) =\frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i!}$$` -- `$$L(\lambda | \vec{y}) = \frac{\lambda^{6 +2+2+1}e^{-4\lambda}}{6!\times2!\times2!\times1!} \propto \lambda^{11}e^{-4\lambda} \; .$$` --- ```r plot_poisson_likelihood(y = c(6, 2, 2, 1), lambda_upper_bound = 10) ``` <img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> --- ## Gamma Prior Let `\(\lambda\)` be a random variable which can take any value between 0 and `\(\infty\)`, ie. `\(\lambda \in [0,\infty)\)`. Then the variability in `\(\lambda\)` might be well modeled by a Gamma model with __shape parameter__ `\(s > 0\)` and __rate parameter__ `\(r > 0\)`: `$$\lambda \sim \text{Gamma}(s, r)$$` -- The Gamma model is specified by continuous pdf `\begin{equation} f(\lambda) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda} \;\; \text{ for } \lambda > 0 \end{equation}` where constant `\(\Gamma(s) = \int_0^\infty z^{s - 1} e^{-z}dz\)`. When `\(s\)` is a positive integer, `\(s \in \{1,2,3,\ldots\}\)`, this constant simplifies to `\(\Gamma(s) = (s - 1)!\)`. --- ## The Exponential model `$$\lambda \sim \text{Exp}(r)$$` is a special case of the Gamma with shape parameter `\(s = 1\)`, `\(\text{Gamma}(1,r)\)`. --- <img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-5-1.png" width="55%" style="display: block; margin: auto;" /> --- ## Trend and Variability `\begin{split} E(\lambda) & = \frac{s}{r} \\ \text{Mode}(\lambda) & = \frac{s - 1}{r} \;\; \text{ for } s \ge 1 \\ \text{Var}(\lambda) & = \frac{s}{r^2} \\ \end{split}` --- ## Gamma Model What is `\(f(\lambda)\)` if `\(\lambda =1\)` and `\(\lambda \sim \text{Gamma}(2, 5)\)` ? ```r plot_gamma(shape = 2, rate = 5) ``` -- .pull-left[ <img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" /> ] .pull-right[ ```r dgamma(x = 1, shape = 2, rate = 5) ``` ``` ## [1] 0.1684487 ``` ] --- ## Tuning Gamma Prior _Before_ we collected any data, our guess was that the rate of fraud risk calls was most likely around 5 calls per day, but could also reasonably range between 2 and 7 calls per day. -- In other words, `\(E(\lambda) = 5\)` and `\(\lambda\)` most likely is between 2 and 7. You can use `plot_gamma()` function to try out different gamma distributions. -- `$$E(\lambda) = \frac{s}{r} \approx 5 \; .$$` --- ## Tuning Gamma Prior ```r plot_gamma(10,2) ``` <img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> --- ## The Posterior Model `$$f(\lambda|\vec{y}) \propto f(\lambda)L(\lambda|\vec{y}) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda} \cdot \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod y_i!} \;\;\; \text{ for } \lambda > 0.$$` -- `$$\begin{split} f(\lambda|\vec{y}) & \propto \lambda^{s-1} e^{-r\lambda} \cdot \lambda^{\sum y_i}e^{-n\lambda} \\ & = \lambda^{s + \sum y_i - 1} e^{-(r+n)\lambda} \\ \end{split}$$` -- $$ \lambda|\vec{y} \; \sim \; \text{Gamma}\bigg(s + \sum y_i, r + n \bigg) \; .$$ --- __The Gamma-Poisson model__ Let `\(\lambda > 0\)` be an unknown _rate_ parameter and `\((Y_1,Y_2,\ldots,Y_n)\)` be an independent `\(\text{Pois}(\lambda)\)` sample. The Gamma-Poisson Bayesian model complements the Poisson structure of data `\(Y\)` with a Gamma prior on `\(\lambda\)`: `$$\begin{split} Y_i | \lambda & \stackrel{ind}{\sim} \text{Pois}(\lambda) \\ \lambda & \sim \text{Gamma}(s, r) \\ \end{split}$$` Upon observing data `\(\vec{y} = (y_1,y_2,\ldots,y_n)\)`, the posterior model of `\(\lambda\)` is also a Gamma with updated parameters: `\begin{equation} \lambda|\vec{y} \; \sim \; \text{Gamma}(s + \sum y_i, \; r + n) \; . \end{equation}` --- __The Gamma-Poisson model__ ```r plot_gamma_poisson(shape = 10, rate = 2, sum_y = 11, n = 4) ``` <img src="slide-04a-gamma-poisson_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />