class: title-slide <br> <br> .right-panel[ # Posterior Prediction ## Dr. Mine Dogucu Examples from [bayesrulesbook.com](https://bayesrulesbook.com) ] --- class: middle <img src="img/posterior-predict.png" width="957" style="display: block; margin: auto;" /> --- ```r head(normal_model_df, 1) ``` ``` ## (Intercept) temp_feel sigma ## 1 -2339.928 84.37713 1289.935 ``` ```r `(Intercept)` <- round(head(normal_model_df, 1)$`(Intercept)`) temp_feel <- round(head(normal_model_df, 1)$temp_feel, 2) sigma <- round(head(normal_model_df, 1)$sigma) ``` --- ```r `(Intercept)` ``` ``` ## [1] -2340 ``` ```r temp_feel ``` ``` ## [1] 84.38 ``` ```r sigma ``` ``` ## [1] 1290 ``` --- Under this particular scenario, `\(\left(\beta_0^{(1)}, \beta_1^{(1)}, \sigma^{(1)}\right) = (-2340, 84.38, 1290)\)`, thus the regression trend is defined by `$$\beta_0^{(1)} + \beta_1^{(1)} X = -2340 + 84.38X \; .$$` ```r y_trend <- `(Intercept)` + temp_feel*75 y_trend ``` ``` ## [1] 3988.5 ``` --- To capture the __sampling variability__ around this plausible trend, ie. the fact that not all 75 degree days have this exact ridership, we can simulate our first official prediction `\(Y_{\text{new}}^{(1)}\)` by taking a random draw from the Normal likelihood specified by this first parameter set: `$$Y_{\text{new}}^{(1)} | \beta_0, \beta_1, \sigma \; \sim \; N\left(3988.5, 1290^2\right) \; .$$` ```r set.seed(84735) y_new <- rnorm(1, y_trend, sigma) y_new ``` ``` ## [1] 4849.23 ``` --- <img src="slide-08b-posterior-prediction_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" /> --- ```r # Predict rides for each parameter set in the chain set.seed(84735) predict_75 <- normal_model_df %>% mutate(y_trend = `(Intercept)` + temp_feel*75) %>% mutate(y_new = rnorm(20000, y_trend, sigma)) ``` -- ```r head(predict_75, 3) ``` ``` ## (Intercept) temp_feel sigma y_trend y_new ## 1 -2339.928 84.37713 1289.935 3988.357 4849.044 ## 2 -2153.331 81.67121 1250.547 3972.010 3817.142 ## 3 -2199.164 81.99670 1313.540 3950.589 4976.290 ``` --- ```r ggplot(predict_75, aes(x = y_trend)) + geom_density() ggplot(predict_75, aes(x = y_new)) + geom_density() ``` <img src="slide-08b-posterior-prediction_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" /> --- ```r # Construct 80% posterior credible intervals predict_75 %>% summarize(lower_trend = quantile(y_trend, 0.1), upper_trend = quantile(y_trend, 0.9), lower_new = quantile(y_new, 0.1), upper_new = quantile(y_new, 0.9)) ``` ``` ## lower_trend upper_trend lower_new upper_new ## 1 3879.099 4044.275 2339.071 5606.899 ``` --- ```r # Simulate a set of predictions set.seed(84735) shortcut_prediction <- posterior_predict( normal_model_sim, newdata = data.frame(temp_feel = 75)) head(shortcut_prediction, 3) ``` ``` ## 1 ## [1,] 4849.044 ## [2,] 3817.142 ## [3,] 4976.290 ``` --- ```r # Plot the approximate predictive model mcmc_dens(shortcut_prediction) + labs(x = "predicted ridership on a 75 degree day") ``` <img src="slide-08b-posterior-prediction_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" /> --- ```r # Construct an 80% posterior credible interval predictions <- data.frame(y_new = shortcut_prediction[,1]) predictions %>% summarize(lower_80 = quantile(y_new, 0.1), upper_80 = quantile(y_new, 0.9)) ``` ``` ## lower_80 upper_80 ## 1 2339.071 5606.899 ```