```{r}
#| echo: false
library(knitr)
```
# Exercise 13.6 (Hotel bookings: getting started)
Plans change. Hotel room bookings get canceled. In the next exercises, you'll explore whether hotel cancellations might be predicted based upon the circumstances of a reservation. Throughout, utilize weakly informative priors and the `hotel_bookings` data in the `bayesrules` package. Your analysis will incorporate the following variables on hotel bookings:

| Variable                 | Notation | Meaning                                           |
|--------------------------|----------|---------------------------------------------------|
| `is_canceled`            | $Y$      | whether or not the booking was canceled           |
| `lead_time`              | $X_1$    | number of days between the booking and arrival    |
| `previous_cancellations` | $X_2$    | number of times the guest has canceled a booking  |
| `is_repeated_guest`      | $X_3$    | whether the guest is a repeat customer            |
| `average_daily_rate`     | $X_4$    | the average per-day cost of the hotel             |

```{r echo=FALSE, warning=FALSE, message=FALSE}
# Load packages
library(bayesrules)
library(tidyverse)
library(rstan)
library(rstanarm)
library(bayesplot)
library(tidybayes)
library(janitor)
library(broom.mixed)
```
```{r warning=FALSE}
data("hotel_bookings")
```
a. What proportion of the sample bookings were canceled?
b. Construct and discuss plots of `is_canceled` vs each of the four potential predictors above.
c. Using formal mathematical notation, specify an appropriate Bayesian regression model of $Y$ by predictors $(X_1, X_2, X_3, X_4)$.
d. Explain your choice for the structure of the data model.
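
One way to start on parts a and b is sketched below. It is a minimal sketch rather than a definitive solution: the object name `cancel_rate` is illustrative, and wrapping variables in `factor()` is a precaution in case some indicators are stored as numbers rather than factors.

```{r}
library(bayesrules)
library(tidyverse)
library(janitor)

data("hotel_bookings")

# Part a: counts and proportions of kept (0) vs. canceled (1) bookings
hotel_bookings %>%
  tabyl(is_canceled)

# The same proportion as a single number (works for factor or numeric coding)
cancel_rate <- mean(hotel_bookings$is_canceled == 1)
cancel_rate

# Part b: quantitative predictors, compared across cancellation status
ggplot(hotel_bookings, aes(x = lead_time, fill = factor(is_canceled))) +
  geom_density(alpha = 0.5)

ggplot(hotel_bookings, aes(x = average_daily_rate, fill = factor(is_canceled))) +
  geom_density(alpha = 0.5)

# Part b: count and indicator predictors, shown as within-group proportions
ggplot(hotel_bookings,
       aes(x = factor(previous_cancellations), fill = factor(is_canceled))) +
  geom_bar(position = "fill")

ggplot(hotel_bookings,
       aes(x = factor(is_repeated_guest), fill = factor(is_canceled))) +
  geom_bar(position = "fill")
```

Because $Y$ takes only two values, proportion-filled bars and overlaid densities tend to be easier to read here than scatterplots.
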

# Exercise 13.7 (Hotel bookings: simulation)
a. Simulate the posterior model of your regression parameters ($\beta_0, \beta_1, \ldots, \beta_4$). Construct trace plots, density plots, and a `pp_check()` of the chain output.
b. Report the posterior median model of hotel cancellations on each of the log(odds), odds, and probability scales.
c. Construct 80% posterior credible intervals for your model coefficients. Interpret those for $\beta_{2}$ and $\beta_{3}$ on the odds scale.
d. Among the four predictors, which are significantly associated with hotel cancellations, both statistically and meaningfully? Explain.
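
The simulation and summaries in this exercise could be approached along the following lines. This sketch assumes a logistic regression (a Bernoulli data model with a logit link), uses `normal(0, 2.5, autoscale = TRUE)` as one possible weakly informative prior, and picks the chain settings and seed arbitrarily; `hotel_model` and `ci_odds` are illustrative names.

```{r}
library(bayesrules)
library(rstanarm)
library(bayesplot)
library(broom.mixed)

data("hotel_bookings")

# Assumed model: logistic regression with weakly informative priors
hotel_model <- stan_glm(
  is_canceled ~ lead_time + previous_cancellations +
    is_repeated_guest + average_daily_rate,
  data = hotel_bookings, family = binomial,
  prior_intercept = normal(0, 2.5, autoscale = TRUE),
  prior = normal(0, 2.5, autoscale = TRUE),
  chains = 4, iter = 5000 * 2, seed = 84735, refresh = 0
)

# Part a: MCMC diagnostics and a posterior predictive check
mcmc_trace(hotel_model)
mcmc_dens_overlay(hotel_model)
pp_check(hotel_model)

# Part b: posterior medians (and 80% intervals) on the log(odds) scale
tidy(hotel_model, effects = "fixed", conf.int = TRUE, conf.level = 0.80)

# Part c: 80% credible intervals, exponentiated to the odds scale
ci_odds <- exp(posterior_interval(hotel_model, prob = 0.80))
ci_odds
```

Exponentiating a log(odds) coefficient gives a multiplicative change in the odds, so an interval for $e^{\beta_2}$ that sits entirely above 1 would suggest that the odds of cancellation rise with each additional previous cancellation.
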

# Exercise 13.8 (Hotel bookings: classification rules)
a. How good is your model at anticipating whether a hotel booking will be canceled? Evaluate the classification accuracy using both the in-sample and cross-validation approaches, along with a 0.5 probability cut-off.
b. Interpret the cross-validated overall accuracy, sensitivity, and specificity measures in the context of this analysis.
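
A possible way to compute both sets of accuracy measures is sketched here using the `classification_summary()` and `classification_summary_cv()` helpers from `bayesrules`. The model specification, the choice of `k = 10` folds, the seed, and the object names `in_sample` and `cv_accuracy` are all assumptions made for illustration.

```{r}
library(bayesrules)
library(rstanarm)

data("hotel_bookings")

# Fit the assumed logistic regression only if it is not already in the session
if (!exists("hotel_model")) {
  hotel_model <- stan_glm(
    is_canceled ~ lead_time + previous_cancellations +
      is_repeated_guest + average_daily_rate,
    data = hotel_bookings, family = binomial,
    prior_intercept = normal(0, 2.5, autoscale = TRUE),
    prior = normal(0, 2.5, autoscale = TRUE),
    chains = 4, iter = 5000 * 2, seed = 84735, refresh = 0
  )
}

# In-sample confusion matrix and accuracy rates at a 0.5 cut-off
set.seed(84735)
in_sample <- classification_summary(
  model = hotel_model, data = hotel_bookings, cutoff = 0.5
)
in_sample$confusion_matrix
in_sample$accuracy_rates

# Cross-validated accuracy rates, averaged across the k folds
set.seed(84735)
cv_accuracy <- classification_summary_cv(
  model = hotel_model, data = hotel_bookings, cutoff = 0.5, k = 10
)
cv_accuracy$cv
```

When reading the output, sensitivity is the proportion of canceled bookings that the rule correctly flags as cancellations, and specificity is the proportion of kept bookings that it correctly classifies as kept.
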

# Exercise 13.9 (Hotel bookings: will they cancel?!)
a. A guest who is new to a hotel and has canceled a booking only once before has booked a \$100-per-day hotel room 30 days in advance. Simulate, plot, and discuss the posterior predictive model of $Y$, whether or not the guest will cancel this booking.
b. Come up with the features of another fictitious booking that’s more likely to be canceled than the booking in part a. Support your claim by simulating, plotting, and comparing this booking’s posterior predictive model of Y to that in part a.
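
The two posterior predictive models can be simulated side by side, as in the rough sketch below. The first row of `bookings` encodes the guest in part a. The second row (a 200-day lead time and 3 previous cancellations) is a purely hypothetical booking, and whether it really is more likely to be canceled has to be read off the simulated output rather than assumed. The model specification and all object names are likewise illustrative.

```{r}
library(bayesrules)
library(rstanarm)
library(bayesplot)
library(ggplot2)

data("hotel_bookings")

# Fit the assumed logistic regression only if it is not already in the session
if (!exists("hotel_model")) {
  hotel_model <- stan_glm(
    is_canceled ~ lead_time + previous_cancellations +
      is_repeated_guest + average_daily_rate,
    data = hotel_bookings, family = binomial,
    prior_intercept = normal(0, 2.5, autoscale = TRUE),
    prior = normal(0, 2.5, autoscale = TRUE),
    chains = 4, iter = 5000 * 2, seed = 84735, refresh = 0
  )
}

# Borrow an observed "not a repeat guest" value so the column type
# (numeric or factor) matches the training data
new_guest <- hotel_bookings$is_repeated_guest[
  hotel_bookings$is_repeated_guest == 0
][1]

# Row 1: the booking in part a. Row 2: a hypothetical booking for part b.
bookings <- data.frame(
  lead_time = c(30, 200),
  previous_cancellations = c(1, 3),
  is_repeated_guest = new_guest,
  average_daily_rate = c(100, 100)
)

# One column of simulated 0/1 outcomes per booking
set.seed(84735)
cancel_draws <- posterior_predict(hotel_model, newdata = bookings)

# Histograms of the simulated outcomes, one panel per booking
mcmc_hist(cancel_draws) + labs(x = "Y")

# Share of simulated outcomes that are cancellations, per booking
cancel_prob <- colMeans(cancel_draws)
cancel_prob
```

Comparing the two entries of `cancel_prob` gives a direct check on the claim in part b: the second booking supports the claim only if its share of simulated cancellations is larger.
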