class: title-slide

<br>
<br>

.right-panel[

# Posterior Inference

## Dr. Mine Dogucu

Examples from [bayesrulesbook.com](https://bayesrulesbook.com)

]

---

```r
library(bayesrules)
library(tidyverse)  # for glimpse() and mutate()
library(janitor)    # for tabyl()
data(moma_sample)
glimpse(moma_sample)
```

```
## Rows: 100
## Columns: 10
## $ artist            <fct> Ad Gerritsen, Kirstine Roepstorff, Lisa Baumgardner,…
## $ country           <fct> dutch, danish, american, american, american, canadia…
## $ birth             <fct> 1940, 1972, 1958, 1952, 1946, 1927, 1901, 1941, 1920…
## $ death             <fct> 2015, NA, 2015, NA, NA, 1966, 1971, NA, 2007, NA, NA…
## $ alive             <lgl> FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, …
## $ genx              <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
## $ gender            <fct> male, female, female, male, male, male, male, male, …
## $ count             <int> 1, 3, 2, 1, 1, 8, 2, 1, 1, 5, 1, 2, 1, 1, 21, 16, 1,…
## $ year_acquired_min <fct> 1981, 2005, 2016, 2001, 2012, 2008, 2018, 1981, 1949…
## $ year_acquired_max <fct> 1981, 2005, 2016, 2001, 2012, 2008, 2018, 1981, 1949…
```

---

class: middle

`$$Y|\pi \sim \text{Bin}(100, \pi)$$`

`$$\pi \sim \text{Beta}(4, 6)$$`

---

```r
# birth is stored as a factor, so convert it to numeric
# before comparing to 1965
moma_sample %>% 
  mutate(genx = (as.numeric(as.character(birth)) >= 1965)) %>% 
  tabyl(genx)
```

```
##   genx  n percent
##  FALSE 86    0.86
##   TRUE 14    0.14
```

---

class: middle

`$$\begin{split} Y | \pi & \sim \text{Bin}(100, \pi) \\ \pi & \sim \text{Beta}(4, 6) \\ \end{split} \;\;\;\; \Rightarrow \;\;\;\; \pi | (Y = 14) \sim \text{Beta}(18, 92)$$`

---

```r
plot_beta_binomial(alpha = 4, beta = 6, y = 14, n = 100)
```

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

class: middle

```r
summarize_beta_binomial(4, 6, y = 14, n = 100)
```

```
##       model alpha beta      mean      mode         var         sd
## 1     prior     4    6 0.4000000 0.3750000 0.021818182 0.14770979
## 2 posterior    18   92
##             0.1636364 0.1574074 0.001232969 0.03511365
```

---

class: middle

## 95% Posterior Credible Interval (CI)

0.025th & 0.975th quantiles of the Beta(18,92) posterior

```r
qbeta(c(0.025, 0.975), 18, 92)
```

```
## [1] 0.1009084 0.2379286
```

--

`\(\int_{0.1}^{0.24} f(\pi|(y=14)) d\pi \; \approx \; 0.95\)`

---

<img src="slide-07a-posterior-inference_files/figure-html/post-ci-ch8-1.png" style="display: block; margin: auto;" />

---

class: middle

`$$\begin{split} H_0: & \; \; \pi \ge 0.20 \\ H_a: & \; \; \pi < 0.20 \\ \end{split}$$`

---

<img src="slide-07a-posterior-inference_files/figure-html/post-prob-ch8-1.png" style="display: block; margin: auto;" />

---

```r
# Posterior probability that pi < 0.20
post_prob <- pbeta(0.20, 18, 92)
post_prob
```

```
## [1] 0.8489856
```

---

`$$\text{Posterior odds } = \frac{P(H_a \; | \; (Y=14))}{P(H_0 \; | \; (Y=14))} \approx 5.62 \; .$$`

--

```r
# Posterior odds
post_odds <- post_prob / (1 - post_prob)
post_odds
```

```
## [1] 5.621883
```

---

<img src="slide-07a-posterior-inference_files/figure-html/prior-post-ch8-1.png" style="display: block; margin: auto;" />

---

`$$P(\pi<0.20)$$`

```r
prior_prob <- pbeta(0.20, 4, 6)
prior_prob
```

```
## [1] 0.08564173
```

--

`$$\text{Prior odds } = \frac{P(H_a)}{P(H_0)} \approx 0.094 \; .$$`

```r
# Prior odds
prior_odds <- prior_prob / (1 - prior_prob)
prior_odds
```

```
## [1] 0.09366321
```

---

The __Bayes Factor (BF)__ compares the posterior odds to the prior odds, and hence provides insight into just how much our understanding about Gen X representation _evolved_ upon observing our sample data:

`$$\text{Bayes Factor} = \frac{\text{Posterior odds }}{\text{Prior odds }}$$`

---

## Bayes Factor

In a hypothesis test of two competing hypotheses, `\(H_a\)` vs `\(H_0\)`, the Bayes Factor is an odds ratio for `\(H_a\)`:

`$$\text{Bayes Factor} = \frac{\text{Posterior odds}}{\text{Prior odds}} = \frac{P(H_a | Y) / P(H_0 | Y)}{P(H_a) / P(H_0)} \; .$$`

As a ratio, it's meaningful to compare the Bayes Factor (BF) to 1.
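For the art example, the BF follows directly from the prior Beta(4, 6) and posterior Beta(18, 92) probabilities computed on the earlier slides (a quick sketch; the name `BF` is just a label chosen here):

```r
# Bayes Factor for H_a: pi < 0.20
post_prob  <- pbeta(0.20, 18, 92)  # posterior P(H_a | Y = 14)
prior_prob <- pbeta(0.20, 4, 6)    # prior P(H_a)
BF <- (post_prob / (1 - post_prob)) / (prior_prob / (1 - prior_prob))
BF  # roughly 60: the odds of H_a increased about 60-fold
```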
To this end, consider three possible scenarios:

1. BF = 1: The plausibility of `\(H_a\)` _didn't change_ in light of the observed data.
2. BF > 1: The plausibility of `\(H_a\)` _increased_ in light of the observed data. Thus the greater the Bayes Factor, the more convincing the evidence for `\(H_a\)`.
3. BF < 1: The plausibility of `\(H_a\)` _decreased_ in light of the observed data.

---

## Two-sided Tests

`$$\begin{split} H_0: & \; \; \pi = 0.3 \\ H_a: & \; \; \pi \ne 0.3 \\ \end{split}$$`

`$$\text{Posterior odds } = \frac{P(H_a \; | \; (Y=14))}{P(H_0 \; | \; (Y=14))} = \frac{1}{0} = \text{ nooooo!}$$`

--

Recall the 95% posterior CI:

```r
qbeta(c(0.025, 0.975), 18, 92)
```

```
## [1] 0.1009084 0.2379286
```

---

```r
library(rstan)

# STEP 1: DEFINE the model
art_model <- "
  data {
    int<lower=0, upper=100> Y;
  }
  parameters {
    real<lower=0, upper=1> pi;
  }
  model {
    Y ~ binomial(100, pi);
    pi ~ beta(4, 6);
  }
"

# STEP 2: SIMULATE the posterior
art_sim <- stan(model_code = art_model, data = list(Y = 14),
                chains = 4, iter = 5000*2, seed = 84735)
```

```
## 
## SAMPLING FOR MODEL '40889b2266edb1362fe73358abd137b3' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:     1 / 10000 [  0%]  (Warmup)
## Chain 1: Iteration:  1000 / 10000 [ 10%]  (Warmup)
## Chain 1: Iteration:  2000 / 10000 [ 20%]  (Warmup)
## Chain 1: Iteration:  3000 / 10000 [ 30%]  (Warmup)
## Chain 1: Iteration:  4000 / 10000 [ 40%]  (Warmup)
## Chain 1: Iteration:  5000 / 10000 [ 50%]  (Warmup)
## Chain 1: Iteration:  5001 / 10000 [ 50%]  (Sampling)
## Chain 1: Iteration:  6000 / 10000 [ 60%]  (Sampling)
## Chain 1: Iteration:  7000 / 10000 [ 70%]  (Sampling)
## Chain 1: Iteration:  8000 / 10000 [ 80%]  (Sampling)
## Chain 1: Iteration:  9000 / 10000 [ 90%]  (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.021724 seconds (Warm-up)
## Chain 1:                0.021834 seconds (Sampling)
## Chain 1:                0.043558 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL '40889b2266edb1362fe73358abd137b3' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:     1 / 10000 [  0%]  (Warmup)
## Chain 2: Iteration:  1000 / 10000 [ 10%]  (Warmup)
## Chain 2: Iteration:  2000 / 10000 [ 20%]  (Warmup)
## Chain 2: Iteration:  3000 / 10000 [ 30%]  (Warmup)
## Chain 2: Iteration:  4000 / 10000 [ 40%]  (Warmup)
## Chain 2: Iteration:  5000 / 10000 [ 50%]  (Warmup)
## Chain 2: Iteration:  5001 / 10000 [ 50%]  (Sampling)
## Chain 2: Iteration:  6000 / 10000 [ 60%]  (Sampling)
## Chain 2: Iteration:  7000 / 10000 [ 70%]  (Sampling)
## Chain 2: Iteration:  8000 / 10000 [ 80%]  (Sampling)
## Chain 2: Iteration:  9000 / 10000 [ 90%]  (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.021341 seconds (Warm-up)
## Chain 2:                0.023645 seconds (Sampling)
## Chain 2:                0.044986 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL '40889b2266edb1362fe73358abd137b3' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 3e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:     1 / 10000 [  0%]  (Warmup)
## Chain 3: Iteration:  1000 / 10000 [ 10%]  (Warmup)
## Chain 3: Iteration:  2000 / 10000 [ 20%]  (Warmup)
## Chain 3: Iteration:  3000 / 10000 [ 30%]  (Warmup)
## Chain 3: Iteration:  4000 / 10000 [ 40%]  (Warmup)
## Chain 3: Iteration:  5000 / 10000 [ 50%]  (Warmup)
## Chain 3: Iteration:  5001 / 10000 [ 50%]  (Sampling)
## Chain 3: Iteration:  6000 / 10000 [ 60%]  (Sampling)
## Chain 3: Iteration:  7000 / 10000 [ 70%]  (Sampling)
## Chain 3: Iteration:  8000 / 10000 [ 80%]  (Sampling)
## Chain 3: Iteration:  9000 / 10000 [ 90%]  (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.021171 seconds (Warm-up)
## Chain 3:                0.021489 seconds (Sampling)
## Chain 3:                0.04266 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL '40889b2266edb1362fe73358abd137b3' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:     1 / 10000 [  0%]  (Warmup)
## Chain 4: Iteration:  1000 / 10000 [ 10%]  (Warmup)
## Chain 4: Iteration:  2000 / 10000 [ 20%]  (Warmup)
## Chain 4: Iteration:  3000 / 10000 [ 30%]  (Warmup)
## Chain 4: Iteration:  4000 / 10000 [ 40%]  (Warmup)
## Chain 4: Iteration:  5000 / 10000 [ 50%]  (Warmup)
## Chain 4: Iteration:  5001 / 10000 [ 50%]  (Sampling)
## Chain 4: Iteration:  6000 / 10000 [ 60%]  (Sampling)
## Chain 4: Iteration:  7000 / 10000 [ 70%]  (Sampling)
## Chain 4: Iteration:  8000 / 10000 [ 80%]  (Sampling)
## Chain 4: Iteration:  9000 / 10000 [ 90%]  (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.021516 seconds (Warm-up)
## Chain 4:                0.021631 seconds (Sampling)
## Chain 4:                0.043147 seconds (Total)
## Chain 4: 
```

---

```r
library(bayesplot)

# Parallel trace plots
mcmc_trace(art_sim, pars = "pi", size = 0.5)
```

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-16-1.png" style="display: block; margin: auto;" />

---

```r
# Density plot for each chain, overlaid
mcmc_dens_overlay(art_sim, pars = "pi")
```

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

---

```r
# Density plot of all chains combined
mcmc_dens(art_sim, pars = "pi")
```

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

---

```r
broom.mixed::tidy(art_sim, conf.int = TRUE, conf.level = 0.95)
```

```
## # A tibble: 1 × 5
##   term  estimate std.error conf.low conf.high
##   <chr>    <dbl>     <dbl>    <dbl>     <dbl>
## 1 pi       0.162    0.0352    0.101     0.239
```

```r
summary(art_sim, pars = "pi")$summary
```

---

```r
mcmc_areas(art_sim, pars = "pi",
           prob = 0.95, point_est = "mean")
```

<img src="slide-07a-posterior-inference_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" />

---

```r
art_chains_df <- as.data.frame(art_sim, pars = "lp__", include = FALSE)
art_chains_df %>% 
  summarize(post_mean = mean(pi),
            post_mode = sample_mode(pi),
            lower_95 = quantile(pi, 0.025),
            upper_95 = quantile(pi, 0.975))
```

```
##   post_mean post_mode  lower_95  upper_95
## 1 0.1642121 0.1597545 0.1011255 0.2387877
```

---

```r
art_chains_df %>% 
  mutate(under_20 = pi < 0.20) %>% 
  tabyl(under_20)
```

```
##  under_20     n percent
##     FALSE  3080   0.154
##      TRUE 16920   0.846
```

--

```r
post_prob
```

```
## [1] 0.8489856
```

---

class: middle

__A Bayesian analysis assesses the uncertainty regarding an unknown parameter `\(\pi\)` in light of observed data `\(Y\)`__:

`$$P((\pi < 0.20) \; | \; (Y = 14)) = 0.8489856 \;.$$`

--

__A frequentist analysis assesses the uncertainty of the observed data `\(Y\)` in light of assumed values of `\(\pi\)`__:

`$$P((Y \le 14) | (\pi = 0.20)) = 0.08$$`
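
---

The frequentist tail probability above can be checked directly with `pbinom()` (a quick check; the 0.08 above is this value rounded):

```r
# P(Y <= 14) when Y ~ Bin(100, 0.20)
pbinom(14, size = 100, prob = 0.20)  # about 0.08
```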