class: center, middle, inverse, title-slide # Teaching and Learning Bayesian Statistics with {bayesrules} ##
bit.ly/dogucu-talks
Southern California R Users Group and R Ladies Irvine ### Mine Dogucu ### 2021-10-26 --- class: middle center <img src="img/headshot.jpeg" alt="A headshot of a woman with curly, short, ear-length hair with green eyes and red lipstick." style="width:165px; margin-top:20px; border: 3px solid whitesmoke; padding: 10px;"> .large[
] <a href = "http://twitter.com/MineDogucu">MineDogucu</a> .large[
] <a href = "http://github.com/mdogucu">mdogucu</a> .large[
] <a href = "http://minedogucu.com">minedogucu.com</a> --- class: middle .pull-left[ .center[ <img src="img/alicia.jpg" alt="A headshot of a woman with long blonde hair wearing a brownish yellow tshirt and a red and pink floral silk scarf wrapped around her neck." style="width:165px; margin-top:20px; border: 3px solid whitesmoke; padding: 10px;"> Alicia Johnson .font-20[Macalester College] [
](https://ajohns24.github.io/portfolio) [
](https://github.com/ajohns24) ] ] .pull-right[ .center[ <img src="img/miles.png" alt="A headshot of a man with short dark hair, and a dark moustache. He is wearing a blue button up shirt and dark gray jacket" style="width:165px; margin-top:20px; border: 3px solid whitesmoke; padding: 10px;"> Miles Ott .font-20[Smith College] [
](https://milesott.com/) [
](https://github.com/MilesOtt) [
](https://twitter.com/Miles_Ott)
]
]

---
class: center

<img src="img/bayes-rules-hex.png" title="a hex shaped logo with shiny green-pink disco ball and purple starry background. There is text that says Bayes Rules!" alt="a hex shaped logo with shiny green-pink disco ball and purple starry background. There is text that says Bayes Rules!" width="25%" style="display: block; margin: auto;" />

.pull-left[
<script src="https://use.fontawesome.com/releases/v5.15.1/js/all.js" data-auto-replace-svg="nest"></script>
<i class="fas fa-book fa-2x" aria-hidden="true" title="Book icon"></i> [Bayes Rules! An Introduction to Bayesian Modeling with R](https://bayesrulesbook.com)
]

.pull-right[
<i class="fab fa-r-project fa-2x" aria-hidden="true" title="R logo"></i> [{bayesrules}](https://www.github.com/bayes-rules/bayesrules)
]

---
class: center middle

### Who are you?

.pull-left[
<img src="img/teacher.png" title="an illustration of a woman holding a book standing in front of a blackboard" alt="an illustration of a woman holding a book standing in front of a blackboard" width="70%" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="img/student-solo.png" title="an illustration of a headshot of a man with glasses and a tie with a shirt collar" alt="an illustration of a headshot of a man with glasses and a tie with a shirt collar" width="40%" style="display: block; margin: auto;" />
]

---
class: middle

### A quick example

Let `\(\pi\)` be the proportion of emails that are spam, where `\(\pi \in [0, 1]\)`.

--

What do you think `\(\pi\)` is? How certain are you?

---

## Binomial Likelihood

```r
plot_binomial_likelihood(y = 3, n = 10)
```

<img src="index_files/figure-html/unnamed-chunk-5-1.png" title="X axis reads pi with values from 0 to 1. Y axis reads l of pi given capital Y = y. The curve of the graph has a high peak on the y-axis when pi equals 0.3. The y-values are almost zero when pi is greater than 0.75." alt="X axis reads pi with values from 0 to 1. 
Y axis reads l of pi given capital Y = y. The curve of the graph has a high peak on the y-axis when pi equals 0.3. The y-values are almost zero when pi is greater than 0.75." style="display: block; margin: auto;" />

---

## Prior Model

.pull-left[

```r
plot_beta(alpha = 4, beta = 4)
```

<img src="index_files/figure-html/unnamed-chunk-6-1.png" title="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.5. The distribution is symmetric." alt="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.5. The distribution is symmetric." style="display: block; margin: auto;" />
]

.pull-right[

```r
plot_beta(alpha = 1, beta = 10)
```

<img src="index_files/figure-html/unnamed-chunk-7-1.png" title="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0. The curve is convex and decreases as pi increases." alt="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0. The curve is convex and decreases as pi increases." style="display: block; margin: auto;" />
]

---

## Posterior Model

```r
plot_beta_binomial(alpha = 1, beta = 10, y = 3, n = 10)
```

<img src="index_files/figure-html/unnamed-chunk-8-1.png" title="X axis reads pi with values from 0 to 1. Y axis reads density. Three curves are shown labeled as prior, (scaled) likelihood, and posterior. The prior curve of the graph has a high peak on the y-axis when pi equals 0. The curve is convex and decreases as pi increases. The likelihood curve has a high peak on the y-axis when pi equals 0.3. The y-values are almost zero when pi is greater than 0.75. The posterior sits between the prior and likelihood curves." 
alt="X axis reads pi with values from 0 to 1. Y axis reads density. Three curves are shown labeled as prior, (scaled) likelihood, and posterior. The prior curve of the graph has a high peak on the y-axis when pi equals 0. The curve is convex and decreases as pi increases. The likelihood curve has a high peak on the y-axis when pi equals 0.3. The y-values are almost zero when pi is greater than 0.75. The posterior sits between the prior and likelihood curves." style="display: block; margin: auto;" />

---
class: middle

### Target Audience of the Book

.pull-left[
<img src="img/student.png" title="an illustration of four students with bags who are holding workbooks." alt="an illustration of four students with bags who are holding workbooks." width="100%" style="display: block; margin: auto;" />
]

.pull-right[
- Advanced Undergraduate Students in Statistics / Data Science Programs
- Equally trained learners
- A prior course or training in statistics is required.
- Familiarity with probability, calculus, and the tidyverse is recommended.
]

---
class: middle

## Our Motivation

- Bayesian methods are becoming more popular due to computing advances and a reevaluation of subjectivity.
- There is a lack of resources for this target audience.

---
class: middle

<style type="text/css"> .panelset { --panel-tab-foreground: whitesmoke; --panel-tab-active-foreground: whitesmoke; --panel-tabs-border-bottom: #00a1a1; --panel-tab-inactive-opacity: 0.5;} </style>

.panelset.sideways[

.panel[.panel-name[Unit 1]

### Bayesian Foundations

.pull-right[
- Bayes' Rule
- The Beta-Binomial Bayesian Model
- Balance and Sequentiality in Bayesian Analysis
- Conjugate Families
]

.pull-left[
<img src="index_files/figure-html/unnamed-chunk-11-1.png" title="three curves on a single plot with no axis labeled. Its coloring scheme indicates its similarity to the previous plot with prior, scaled likelihood and posterior" alt="three curves on a single plot with no axis labeled. 
Its coloring scheme indicates its similarity to the previous plot with prior, scaled likelihood and posterior" style="display: block; margin: auto;" />
]
]

.panel[.panel-name[Unit 2]

### Posterior Simulation & Analysis

.pull-right[
<img src="img/unit2.png" title="A traceplot with no axis labels. Traceplots have thin vertical lines with varying lengths." alt="A traceplot with no axis labels. Traceplots have thin vertical lines with varying lengths." width="100%" style="display: block; margin: auto;" />
]

.pull-left[
- Grid Approximation
- The Metropolis-Hastings Algorithm
- Posterior Estimation
- Posterior Hypothesis Testing
- Posterior Prediction
]
]

.panel[.panel-name[Unit 3]

### Regression and Classification

.pull-right[
<img src="img/unit3.png" title="A scatterplot with multiple regression lines passing through points. These regression lines are not scattered all over the place; they cluster together with similar but varying intercepts and slopes." alt="A scatterplot with multiple regression lines passing through points. These regression lines are not scattered all over the place; they cluster together with similar but varying intercepts and slopes." width="100%" style="display: block; margin: auto;" />
]

.pull-left[
- Normal Regression
- Poisson and Negative Binomial Regression
- Logistic Regression
- Naive Bayes Classification
]
]

.panel[.panel-name[Unit 4]

### Hierarchical Models

.pull-right[
<img src="img/unit4.png" title="A figure showing a hierarchy with a rectangle on top. A set of arrows points downward from it to a set of rectangles below, which in turn have arrows pointing downward to a different set of rectangles." alt="A figure showing a hierarchy with a rectangle on top. A set of arrows points downward from it to a set of rectangles below, which in turn have arrows pointing downward to a different set of rectangles." 
width="100%" style="display: block; margin: auto;" />
]

.pull-left[
- Normal Hierarchical Models without Predictors
- Normal Hierarchical Models with Predictors
- Non-Normal Hierarchical Regression & Classification
]
]
]

---
class: center middle

.large[Pedagogical Approach]

---
class: middle center

## Checking Intuition

<img src="img/fake-news-diagram.png" title="There are two ellipses at the top of the image. The first ellipse reads 'Prior: Only 40% of articles are fake'. The second ellipse reads 'Data: Exclamation points are more common among fake news'. Two arrows, one from each of the upper ellipses, lead to a third ellipse in the lower part of the image. The third ellipse reads 'Posterior: Is the article fake or not?'" alt="There are two ellipses at the top of the image. The first ellipse reads 'Prior: Only 40% of articles are fake'. The second ellipse reads 'Data: Exclamation points are more common among fake news'. Two arrows, one from each of the upper ellipses, lead to a third ellipse in the lower part of the image. The third ellipse reads 'Posterior: Is the article fake or not?'" width="80%" style="display: block; margin: auto;" />

---

.center[

## Active Learning

.pull-left[
Quizzes

[Quiz Yourself](https://www.bayesrulesbook.com/chapter-1.html#quiz-yourself-1)

<img src="img/quiz.png" title="An icon with a question mark and four choices labeled as A, B, C, and D." alt="An icon with a question mark and four choices labeled as A, B, C, and D." width="40%" style="display: block; margin: auto;" />
]

.pull-right[
Hands-on Programming

[Metropolis-Hastings Algorithm](https://www.bayesrulesbook.com/chapter-7.html#the-metropolis-hastings-algorithm)

<img src="img/programming.png" title="An icon of a student sitting on a chair in front of a desk, typing on a computer." alt="An icon of a student sitting on a chair in front of a desk, typing on a computer." 
width="40%" style="display: block; margin: auto;" /> ] ] --- class: center middle ### Computing and Math Together .pull-left[ <i class="fas fa-laptop-code fa-6x" aria-hidden="true" title="Laptop icon with code"></i> ] <i class="fas fa-subscript fa-6x" aria-hidden="true" title="X sub 1"></i> --- class: middle center .pull-left[ ### Compute for a Single Case <img src="img/build.png" title="A construction sign with a figure shoveling." alt="A construction sign with a figure shoveling." width="50%" style="display: block; margin: auto;" /> ] .pull-right[ ### Then Use Built-In Functions <i class="fab fa-r-project fa-7x" aria-hidden="true" title="R logo"></i> ] --- class: center middle ### Accessibility and Inclusion <table> <thead> <tr> <th style="text-align:left;"> Accessibility and Inclusion Criteria </th> <th style="text-align:left;"> Questions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Accessibility </td> <td style="text-align:left;"> Is the cost affordable for learners from diverse socioeconomic backgrounds? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Are plots distinguishable to color blind learners? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Is alt text provided for images? </td> </tr> </tbody> </table> --- class: center middle <table> <thead> <tr> <th style="text-align:left;"> Accessibility and Inclusion Criteria </th> <th style="text-align:left;"> Questions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Inclusivity of scholars </td> <td style="text-align:left;"> Do the cited scholars represent diversity across identities, experiences, and expertise? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Are scholars cited using the correct names and pronouns? 
</td> </tr> </tbody> </table> --- class: center middle <table> <thead> <tr> <th style="text-align:left;"> Accessibility and Inclusion Criteria </th> <th style="text-align:left;"> Questions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Inclusivity of students </td> <td style="text-align:left;"> Do examples avoid the necessity of specialized knowledge? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Do names and pronouns reflect diverse cultural and personal identities? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Are there examples that could potentially speak to younger as well as older students? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Does the delivery embrace mistakes and critical thinking? </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Are efforts made to accommodate different academic experiences and create a shared foundation? </td> </tr> </tbody> </table> --- class: center middle [More on accessibility and inclusion is available as a preprint](https://arxiv.org/abs/2110.06355) --- class: center middle .font-75[R packages] --- class: middle center .large[
] <a href = "http://github.com/bayes-rules/bayesrules">library(bayesrules)</a>

```r
# Install the development version from GitHub (requires the devtools package)
devtools::install_github("bayes-rules/bayesrules")
```

---

.pull-left[

```r
plot_beta(alpha = 3, beta = 8)
```

<div class="figure" style="text-align: center">
<img src="index_files/figure-html/unnamed-chunk-20-1.png" alt="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.25. The y-values are almost zero when pi is greater than 0.70." />
<p class="caption">X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.25. The y-values are almost zero when pi is greater than 0.70.</p>
</div>
]

.pull-right[

```r
plot_beta(alpha = 10, beta = 2)
```

<div class="figure" style="text-align: center">
<img src="index_files/figure-html/unnamed-chunk-21-1.png" alt="X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.9. The y-values are almost zero when pi is less than 0.50." />
<p class="caption">X axis reads pi with values from 0 to 1. Y axis reads f of pi. The curve of the graph has a high peak on the y-axis when pi equals 0.9. The y-values are almost zero when pi is less than 0.50.</p>
</div>
]

---

```r
plot_beta_binomial(alpha = 3, beta = 8, y = 19, n = 20)
```

<img src="index_files/figure-html/unnamed-chunk-22-1.png" title="X axis reads pi with values from 0 to 1. Y axis reads density. Three curves are shown labeled as prior, (scaled) likelihood, and posterior. The prior curve of the graph has a high peak on the y-axis when pi equals 0.25 and y values are close to zero when pi is greater than 0.7. The likelihood curve has a high peak on the y-axis when pi equals 0.95 and is quite peaked with low variance. The posterior sits between the prior and likelihood curves." alt="X axis reads pi with values from 0 to 1. Y axis reads density. 
Three curves are shown labeled as prior, (scaled) likelihood, and posterior. The prior curve of the graph has a high peak on the y-axis when pi equals 0.25 and y values are close to zero when pi is greater than 0.7. The likelihood curve has a high peak on the y-axis when pi equals 0.95 and is quite peaked with low variance. The posterior sits between the prior and likelihood curves." style="display: block; margin: auto;" />

---

<img src="index_files/figure-html/unnamed-chunk-24-1.png" style="display: block; margin: auto;" />

---
class: middle

.pull-left[

### Plotting Functions

`plot_beta()`
`plot_binomial_likelihood()`
`plot_beta_binomial()`
`plot_gamma()`
`plot_poisson_likelihood()`
`plot_gamma_poisson()`
`plot_normal()`
`plot_normal_likelihood()`
`plot_normal_normal()`
]

.pull-right[

### Summary Functions

`summarize_beta()`
`summarize_beta_binomial()`
<br>
`summarize_gamma()`
`summarize_gamma_poisson()`
<br>
`summarize_normal_normal()`
]

---
class: middle

## Model Evaluation Functions

| Functions | Response | Model Type |
|-----------|----------|------------|
| `prediction_summary()` <br> `prediction_summary_cv()` | Quantitative | rstanreg |
| `classification_summary()` <br> `classification_summary_cv()` | Binary | rstanreg |
| `naive_classification_summary()` <br> `naive_classification_summary_cv()` | Categorical | naiveBayes |

---

## Prediction Summary

.pull-left[

```r
prediction_summary(model, data,
                   prob_inner = 0.6,
                   prob_outer = 0.80)
```

```
       mae mae_scaled within_60 within_80
1 3.499055  0.5628169      0.75      0.85
```
]

--

.pull-right[

```r
prediction_summary_cv(model = model,
                      data = data,
*                     k = 2,
                      prob_inner = 0.6,
                      prob_outer = 0.80)
```

```
$folds
  fold      mae mae_scaled within_60 within_80
1    1 3.628639  0.5984213       0.8       0.8
2    2 3.138409  0.3751545       0.8       0.9

$cv
       mae mae_scaled within_60 within_80
1 3.383524  0.4867879       0.8      0.85
```
]

---
class: middle

## `library(rstan)`

.pull-left[

```r
# STEP 1: DEFINE the model
stan_bike_model <- "
  data {
    int<lower=0> n;
    vector[n] Y;
    vector[n] X;
  }
  parameters {
    real beta0;
    real beta1;
    real<lower=0> sigma;
  }
  model {
    Y ~ normal(beta0 + beta1 * X, sigma);
  }
"
```
]

.pull-right[

```r
# STEP 2: SIMULATE the posterior
# iter counts warmup too: by default half of the 10000 iterations per chain are warmup
stan_bike_sim <- stan(model_code = stan_bike_model,
                      data = list(n = nrow(bikes),
                                  Y = bikes$rides,
                                  X = bikes$temp_feel),
                      chains = 4, iter = 5000*2, seed = 84735)
```
]

---
class: middle

## `library(rstanarm)`

```r
# Same Normal regression of rides on temp_feel, without writing Stan code by hand
normal_model_sim <- stan_glm(rides ~ temp_feel,
                             data = bikes,
                             family = gaussian,
                             chains = 4, iter = 5000*2, seed = 84735)
```

---

### `library(bayesplot)`

.pull-left[

```r
mcmc_trace(normal_model_sim, size = 0.1)
```

<img src="index_files/figure-html/unnamed-chunk-32-1.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right[

```r
mcmc_dens_overlay(normal_model_sim)
```

<img src="index_files/figure-html/unnamed-chunk-33-1.png" width="80%" style="display: block; margin: auto;" />
]

---
class: middle

## Resources

- [Undergraduate Bayesian Education Resources](https://undergrad-bayes.netlify.app/)

--

- [Undergraduate Bayesian Education Network](https://undergrad-bayes.netlify.app/network.html)

--

- [STATS 115 at UC Irvine](https://www.stats115.com)

---
class: middle

## SoCal Data Science project

- A collaboration among UC Irvine, Cal State Fullerton, and Cypress College.
- Cal State Fullerton will develop and implement a course similar to UC Irvine's Stats 115 Introduction to Bayesian Data Analysis.
- Through this project, our students work on real Bayesian and non-Bayesian projects from our academic and industry partners.

HDR DSC awards: \#2123366 \#2123380 \#2123384

<img src="img/nsf-logo.png" title="NSF logo" alt="NSF logo" width="10%" style="display: block; margin: auto;" />

---

.center[
.middle[
Questions?

[bit.ly/dogucu-talks](https://bit.ly/dogucu-talks)

.large[
] <a href = "http://twitter.com/MineDogucu">MineDogucu</a> ] ] .footnote[<a href="https://www.freepnglogos.com">Images from freepnglogos.com</a>]
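
---
class: middle

### Appendix: the spam example by hand

The Posterior Model slide shows `plot_beta_binomial(alpha = 1, beta = 10, y = 3, n = 10)` but not the algebra behind it. As a sketch, assuming the standard conjugate Beta-Binomial setup (the symbols `\(\alpha\)`, `\(\beta\)`, `\(y\)`, and `\(n\)` mirror the function arguments; the derivation itself is not taken from the slides):

`$$\pi \sim \text{Beta}(\alpha, \beta), \qquad Y \mid \pi \sim \text{Bin}(n, \pi)$$`

`$$f(\pi \mid y) \propto \pi^{\alpha - 1}(1 - \pi)^{\beta - 1} \cdot \pi^{y}(1 - \pi)^{n - y} = \pi^{\alpha + y - 1}(1 - \pi)^{\beta + n - y - 1}$$`

`$$\pi \mid (Y = y) \sim \text{Beta}(\alpha + y, \; \beta + n - y) = \text{Beta}(1 + 3, \; 10 + 7) = \text{Beta}(4, 17)$$`

Under these assumptions the posterior mean is `\(4/21 \approx 0.19\)`, which lies between the prior mean `\(1/11 \approx 0.09\)` and the observed proportion `\(3/10 = 0.3\)`. This is consistent with the plot description, where the posterior sits between the prior and the likelihood.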