


Evaluating Regression Models

Dr. Mine Dogucu

Examples from bayesrulesbook.com

1 / 23
  1. How fair is the model?

  2. How wrong is the model?

  3. How accurate are the posterior predictive models?

2 / 23

Posterior predictive check

Consider a regression model with response variable $Y$, predictor $X$, and a set of regression parameters $\theta$. For example, in the model above $\theta = (\beta_0, \beta_1, \sigma)$. Further, let $\left\{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(N)}\right\}$ be an $N$-length Markov chain for the posterior model of $\theta$. Then a "good" Bayesian model will produce predictions of $Y$ with features similar to the original $Y$ data. To evaluate whether your model satisfies this goal:

  1. At each set of posterior plausible parameters $\theta^{(i)}$, simulate a sample of $Y$ values from the likelihood model, one corresponding to each $X$ in the original sample of size $n$. This produces $N$ separate samples of size $n$.
  2. Compare the features of the $N$ simulated $Y$ samples, or a subset of these samples, to those of the original $Y$ data (a code sketch follows below).
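
A minimal sketch of steps 1 and 2, assuming (as on the slides that follow) a Normal regression model of rides versus temp_feel whose posterior draws of the intercept, slope, and sigma sit in the data frame normal_model_df:

# Step 1: at the i-th posterior draw, simulate one Y for each X in the data
simulate_sample <- function(draw, data) {
  rnorm(nrow(data),
        mean = draw$`(Intercept)` + draw$temp_feel * data$temp_feel,
        sd   = draw$sigma)
}

# One size-n sample per posterior draw: N separate samples of size n
simulated_samples <- lapply(seq_len(nrow(normal_model_df)),
                            function(i) simulate_sample(normal_model_df[i, ], bikes))

# Step 2: compare a feature (here, the mean) of the simulated samples
# to the same feature of the observed data
summary(sapply(simulated_samples, mean))
mean(bikes$rides)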
3 / 23
head(normal_model_df, 1)
##   (Intercept) temp_feel    sigma
## 1   -2339.928  84.37713 1289.935
`(Intercept)` <- round(head(normal_model_df, 1)$`(Intercept)`)
temp_feel <- round(head(normal_model_df, 1)$temp_feel, 2)
sigma <- round(head(normal_model_df, 1)$sigma)
4 / 23
`(Intercept)`
## [1] -2340
temp_feel
## [1] 84.38
sigma
## [1] 1290
5 / 23
set.seed(84735)
one_simulation <- bikes %>%
  mutate(simulated_rides = rnorm(
    500, mean = -2340 + 84.38 * temp_feel, sd = 1290)) %>%
  select(temp_feel, rides, simulated_rides)
6 / 23
set.seed(84735)
one_simulation <- bikes %>%
  mutate(simulated_rides = rnorm(
    500, mean = -2340 + 84.38 * temp_feel, sd = 1290)) %>%
  select(temp_feel, rides, simulated_rides)
head(one_simulation, 3)
##   temp_feel rides simulated_rides
## 1  64.72625   654        3982.331
## 2  49.04645  1229        1638.786
## 3  51.09098  1454        2978.376
7 / 23
ggplot(one_simulation, aes(x = simulated_rides)) +
  geom_density(color = "lightblue") +
  geom_density(aes(x = rides), color = "darkblue")

8 / 23
# Examine 50 of the 20000 simulated samples
set.seed(84735)
bayesplot::pp_check(normal_model_sim, nreps = 50)

9 / 23
bikes %>%
  filter(date == "2012-10-22") %>%
  select(temp_feel, rides)
##   temp_feel rides
## 1  75.46478  6228
10 / 23

observed value: $Y$
posterior predictive median: $Y'$
predictive error: $Y - Y'$

11 / 23
predict_75 %>%
  summarize(
    median = median(y_new),
    error = 6228 - median(y_new))
##     median    error
## 1 3946.558 2281.442
12 / 23

median absolute deviation (mad): the typical distance between a posterior prediction and the posterior predictive median.

We estimate the mad by calculating the absolute deviations of our 20,000 posterior predictions of $Y$, $\left\{Y_{\text{new}}^{(1)}, Y_{\text{new}}^{(2)}, \ldots, Y_{\text{new}}^{(20000)}\right\}$, from the posterior predictive median $Y'$ and then taking the median of these deviations:

$$\text{mad} = \underset{i \in \{1,2,\ldots,20000\}}{\text{median}} \left|Y_{\text{new}}^{(i)} - Y'\right|.$$
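
The same number can be computed straight from this definition. A quick sketch using the predict_75 data frame from the previous slides; it matches mad(y_new, constant = 1) on the next slide:

# mad by hand: median absolute deviation of the 20,000 predictions
# from their own median
predict_75 %>%
  summarize(predictive_mad = median(abs(y_new - median(y_new))))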

13 / 23
predict_75 %>%
  summarize(
    error = 6228 - median(y_new),
    predictive_mad = mad(y_new, constant = 1),
    error_scaled = error / predictive_mad)
##      error predictive_mad error_scaled
## 1 2281.442       867.8032     2.628986
14 / 23

Posterior Prediction Interval

predict_75 %>%
  summarize(lower_95 = quantile(y_new, 0.025),
            lower_50 = quantile(y_new, 0.25),
            upper_50 = quantile(y_new, 0.75),
            upper_95 = quantile(y_new, 0.975))
##   lower_95 lower_50 upper_50 upper_95
## 1 1494.581  3086.11 4822.178 6500.741
15 / 23
set.seed(84735)
predictions <- posterior_predict(normal_model_sim,
                                 newdata = bikes)
dim(predictions)
## [1] 20000   500
16 / 23
ppc_intervals(bikes$rides,
              yrep = predictions,
              x = bikes$temp_feel,
              prob = 0.5, prob_outer = 0.95)

17 / 23

Let $Y_1, Y_2, \ldots, Y_n$ denote $n$ observed outcomes. Then each $Y_i$ has a corresponding posterior predictive model with median $Y_i'$ and median absolute deviation $\text{mad}_i$. We can evaluate the overall posterior predictive model quality by the following measures:

  • The median absolute error mae

    $$\text{mae} = \underset{i \in \{1,2,\ldots,n\}}{\text{median}} \left|Y_i - Y_i'\right|$$

  • The scaled median absolute error mae_scaled

    $$\text{mae\_scaled} = \underset{i \in \{1,2,\ldots,n\}}{\text{median}} \frac{\left|Y_i - Y_i'\right|}{\text{mad}_i}$$

  • within_50 and within_95 measure the proportions of observed values $Y_i$ that fall within their 50% and 95% posterior prediction intervals, respectively. A sketch computing all four measures by hand follows this list.
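
A sketch of all four measures computed by hand, using the 20000 x 500 predictions matrix built earlier with posterior_predict():

y <- bikes$rides
medians <- apply(predictions, 2, median)             # one Y_i' per case
mads    <- apply(predictions, 2, mad, constant = 1)  # one mad_i per case

median(abs(y - medians))                             # mae
median(abs(y - medians) / mads)                      # mae_scaled
mean(y >= apply(predictions, 2, quantile, 0.25) &
     y <= apply(predictions, 2, quantile, 0.75))     # within_50
mean(y >= apply(predictions, 2, quantile, 0.025) &
     y <= apply(predictions, 2, quantile, 0.975))    # within_95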

18 / 23
# Posterior predictive summaries
prediction_summary(y = bikes$rides,
                   yrep = predictions)
##        mae mae_scaled within_50 within_95
## 1 989.8291   1.144095      0.44     0.968
19 / 23

The k-fold cross validation algorithm

  1. Create folds.
    Let $k$ be some integer from 2 to our original sample size $n$. Split the data into $k$ folds, or subsets, of roughly equal size.

  2. Train and test the model.

    • Train the model using the first $k - 1$ data folds combined.
    • Test this model on the $k$th data fold.
    • Measure the prediction quality (e.g., by MAE).
  3. Repeat.
    Repeat step 2 a total of $k - 1$ times, each time leaving out a different fold for testing.

  4. Calculate cross-validation estimates.
    Steps 2 and 3 produce $k$ different training models and $k$ corresponding measures of prediction quality. Average these $k$ measures to obtain a single cross-validation estimate of prediction quality. A hand-rolled sketch of this procedure follows.
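
A hand-rolled sketch of this algorithm with k = 10 and MAE as the quality measure, assuming normal_model_sim is a stan_glm() fit of rides ~ temp_feel (default priors used here for simplicity); prediction_summary_cv() on the next slides automates all of this:

set.seed(84735)
k <- 10
fold_id <- sample(rep(1:k, length.out = nrow(bikes)))    # step 1: create folds

fold_mae <- sapply(1:k, function(j) {
  train <- bikes[fold_id != j, ]                         # k - 1 folds combined
  test  <- bikes[fold_id == j, ]                         # the left-out fold
  fit   <- rstanarm::stan_glm(rides ~ temp_feel, data = train,
                              family = gaussian, refresh = 0)
  preds <- rstanarm::posterior_predict(fit, newdata = test)
  median(abs(test$rides - apply(preds, 2, median)))      # MAE on the test fold
})

mean(fold_mae)                                           # step 4: cv estimate of MAE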

20 / 23
set.seed(84735)
cv_procedure <- prediction_summary_cv(
  data = bikes, model = normal_model_sim, k = 10)
21 / 23
cv_procedure$folds
##    fold       mae mae_scaled within_50 within_95
## 1     1  982.3498  1.1347732      0.46      0.98
## 2     2  962.4132  1.1042769      0.40      1.00
## 3     3  956.3402  1.1045559      0.42      0.98
## 4     4 1011.6086  1.1598395      0.46      0.98
## 5     5 1180.3943  1.3649528      0.40      0.96
## 6     6  932.9605  1.0620776      0.46      0.94
## 7     7 1267.7214  1.4886004      0.32      0.96
## 8     8 1117.9843  1.2874389      0.36      1.00
## 9     9 1108.5276  1.3011456      0.40      0.92
## 10   10  789.2742  0.9012649      0.56      0.94
22 / 23
cv_procedure$cv
##        mae mae_scaled within_50 within_95
## 1 1030.957   1.190893     0.424     0.966
23 / 23