Balance and Sequentiality in Bayesian Analyses

Dr. Mine Dogucu

The notes for this lecture are derived from Chapter 4 of the Bayes Rules! book

Balance in Bayesian Analyses

Data Context

Bechdel Test

Alison Bechdel’s 1985 comic Dykes to Watch Out For has a strip called The Rule, in which a character states that they only go to a movie if it satisfies the following three rules:

  • the movie has to have at least two women in it;

  • these two women talk to each other; and

  • they talk about something besides a man.

The Bechdel test is used to assess how movies represent women. Even though there are three criteria, a movie either passes or fails the test.

Different Priors, Different Posteriors

Let \(\pi\) be the proportion of movies that pass the Bechdel test.

Below are three different people with three different priors for \(\pi\).

optimist     clueless    feminist
Beta(14,1)   Beta(1,1)   Beta(5,11)

Plot their priors.

Priors
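
One way to draw these three priors is with plot_beta() from the bayesrules package; a minimal sketch (each call produces a separate plot):

library(bayesrules)
# Each person's prior for pi, the proportion of movies that pass the test
plot_beta(14, 1)   # the optimist
plot_beta(1, 1)    # the clueless
plot_beta(5, 11)   # the feminist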

Vocabulary

Informative prior: An informative prior reflects specific information about the unknown variable with high certainty (i.e., low variability).

Vague (diffuse) prior: A vague or diffuse prior reflects little specific information about the unknown variable. A flat prior, which assigns equal prior plausibility to all possible values of the variable (e.g., the Beta(1,1) model), is a special case.

Data

  • The bayesrules package has the bechdel data frame. Randomly select 20 movies from this dataset (seed = 84735).

  • Based on the observed data, update the posterior for all three people.

  • Calculate the summary statistics for the prior and the posterior for all three.

  • Plot the prior, likelihood, and the posterior for all three.

  • Explain the effect of different priors on the posterior.

library(tidyverse)
library(bayesrules)
set.seed(84735)
bechdel_sample <- sample_n(bechdel, 20)
count(bechdel_sample, binary)
# A tibble: 2 × 2
  binary     n
  <chr>  <int>
1 FAIL      11
2 PASS       9

The Optimist

summarize_beta_binomial(14, 1, y = 9, n = 20)
      model alpha beta      mean      mode         var         sd
1     prior    14    1 0.9333333 1.0000000 0.003888889 0.06236096
2 posterior    23   12 0.6571429 0.6666667 0.006258503 0.07911070
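
These numbers follow from the Beta-Binomial updating rule: combining a Beta(\(\alpha\), \(\beta\)) prior with \(y\) successes in \(n\) trials gives a Beta(\(\alpha + y\), \(\beta + n - y\)) posterior. For the optimist,

\[\pi | (Y = 9) \sim Beta(14 + 9, 1 + 20 - 9) = Beta(23, 12)\;.\]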

The Optimist

plot_beta_binomial(14, 1, y = 9, n = 20)

The Clueless

summarize_beta_binomial(1, 1, y = 9, n = 20)
      model alpha beta      mean mode        var        sd
1     prior     1    1 0.5000000  NaN 0.08333333 0.2886751
2 posterior    10   12 0.4545455 0.45 0.01077973 0.1038255

The Clueless

plot_beta_binomial(1, 1, y = 9, n = 20)

The Feminist

summarize_beta_binomial(5, 11, y = 9, n = 20)
      model alpha beta      mean      mode        var         sd
1     prior     5   11 0.3125000 0.2857143 0.01263787 0.11241827
2 posterior    14   22 0.3888889 0.3823529 0.00642309 0.08014418

The Feminist

plot_beta_binomial(5, 11, y = 9, n = 20)

Comparison
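
A minimal sketch of one way to compare the three posteriors side by side, overlaying their densities with ggplot2 (this overlay is an illustration, not the book's plot_beta_binomial() output); it shows how the same 9-out-of-20 data pulls each prior toward the observed pass rate:

library(tidyverse)
# Overlay the three posterior densities found above
ggplot(data.frame(pi = c(0, 1)), aes(x = pi)) +
  stat_function(aes(color = "optimist: Beta(23,12)"), fun = dbeta,
                args = list(shape1 = 23, shape2 = 12)) +
  stat_function(aes(color = "clueless: Beta(10,12)"), fun = dbeta,
                args = list(shape1 = 10, shape2 = 12)) +
  stat_function(aes(color = "feminist: Beta(14,22)"), fun = dbeta,
                args = list(shape1 = 14, shape2 = 22)) +
  labs(x = expression(pi), y = "density", color = "posterior")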

Different Data, Different Posteriors

Morteza, Nadide, and Ursula all share the optimistic Beta(14,1) prior for \(\pi\), but each has access to different data: Morteza reviews movies from 1991, Nadide reviews movies from 2000, and Ursula reviews movies from 2013. How will their posterior distributions differ?

Morteza’s analysis

bechdel_1991 <- filter(bechdel, year == 1991)
count(bechdel_1991, binary)
# A tibble: 2 × 2
  binary     n
  <chr>  <int>
1 FAIL       7
2 PASS       6
6/13
[1] 0.4615385

Morteza’s analysis

plot_beta_binomial(14, 1, y = 6, n = 13)

Nadide’s analysis

bechdel_2000 <- filter(bechdel, year == 2000)
count(bechdel_2000, binary)
# A tibble: 2 × 2
  binary     n
  <chr>  <int>
1 FAIL      34
2 PASS      29
29/(34+29)
[1] 0.4603175

Nadide’s analysis

plot_beta_binomial(14, 1, y = 29, n = 63)

Ursula’s analysis

bechdel_2013 <- filter(bechdel, year == 2013)
count(bechdel_2013, binary)
# A tibble: 2 × 2
  binary     n
  <chr>  <int>
1 FAIL      53
2 PASS      46
46/(53+46)
[1] 0.4646465

Ursula’s analysis

plot_beta_binomial(14, 1, y = 46, n = 99)

Summary
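
As a sketch, the numerical side of this comparison can be pulled together with the same function used above:

# Same Beta(14,1) prior, increasingly large data sets
summarize_beta_binomial(14, 1, y = 6, n = 13)   # Morteza, 1991
summarize_beta_binomial(14, 1, y = 29, n = 63)  # Nadide, 2000
summarize_beta_binomial(14, 1, y = 46, n = 99)  # Ursula, 2013

All three sample pass rates are close to 0.46, yet the prior mean of 14/15 ≈ 0.93 gives way to the data more and more as the sample grows: Ursula's 99 movies pull her posterior mean much closer to the data than Morteza's 13 movies pull his.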

Sequentiality in Bayesian Analyses

Sequential Analysis

In a sequential Bayesian analysis, a posterior model is updated incrementally as more data comes in. With the introduction of each new piece of data, the previous posterior model, which reflects our understanding prior to observing this data, becomes the new prior model.

Let’s time travel to the end of 1970

\(\pi \sim Beta(14,1)\)

bechdel |> 
  filter(year == 1970) 
# A tibble: 1 × 3
   year title                          binary
  <dbl> <chr>                          <chr> 
1  1970 Beyond the Valley of the Dolls PASS  

The Posterior

summarize_beta_binomial(14, 1, y = 1, n = 1)
      model alpha beta      mean mode         var         sd
1     prior    14    1 0.9333333    1 0.003888889 0.06236096
2 posterior    15    1 0.9375000    1 0.003446691 0.05870853

At the end of 1971

\(\pi \sim Beta(15,1)\)

bechdel |> 
  filter(year == 1971) 
# A tibble: 5 × 3
   year title                                   binary
  <dbl> <chr>                                   <chr> 
1  1971 Escape from the Planet of the Apes      FAIL  
2  1971 Shaft                                   FAIL  
3  1971 Straw Dogs                              FAIL  
4  1971 The French Connection                   FAIL  
5  1971 Willy Wonka &amp; the Chocolate Factory FAIL  

The Posterior

summarize_beta_binomial(15, 1, y = 0, n = 5)
      model alpha beta      mean      mode         var         sd
1     prior    15    1 0.9375000 1.0000000 0.003446691 0.05870853
2 posterior    15    6 0.7142857 0.7368421 0.009276438 0.09631427

At the end of 1972

\(\pi \sim Beta(15,6)\)

bechdel |> 
  filter(year == 1972) 
# A tibble: 3 × 3
   year title          binary
  <dbl> <chr>          <chr> 
1  1972 1776           FAIL  
2  1972 Pink Flamingos PASS  
3  1972 The Godfather  FAIL  

The Posterior

summarize_beta_binomial(15, 6, y = 1, n = 3)
      model alpha beta      mean      mode         var         sd
1     prior    15    6 0.7142857 0.7368421 0.009276438 0.09631427
2 posterior    16    8 0.6666667 0.6818182 0.008888889 0.09428090

Time                   Data           Model
before the analysis    NA             Beta(14,1)
at the end of 1970     Y = 1, n = 1   Beta(15,1)
at the end of 1971     Y = 0, n = 5   Beta(15,6)
at the end of 1972     Y = 1, n = 3   Beta(16,8)
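
The same chain of updates can be written as a short loop in which each year's posterior becomes the next year's prior; a minimal sketch, assuming the bechdel data frame from bayesrules is loaded:

# Start from the Beta(14,1) prior
alpha <- 14
beta  <- 1
for (yr in 1970:1972) {
  movies <- filter(bechdel, year == yr)
  y <- sum(movies$binary == "PASS")   # passes observed that year
  n <- nrow(movies)                   # movies reviewed that year
  alpha <- alpha + y                  # conjugate update: Beta(alpha, beta) prior
  beta  <- beta + n - y               # becomes Beta(alpha + y, beta + n - y) posterior
}
c(alpha, beta)                        # 16 and 8, i.e. Beta(16,8) as in the table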


Data Order Invariance

Time                   Data           Model
before the analysis    NA             Beta(14,1)
1972                   Y = 1, n = 3   Beta(15,3)
1971                   Y = 0, n = 5   Beta(15,8)
1970                   Y = 1, n = 1   Beta(16,8)
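
This reversed chain can be checked with the same function, feeding each posterior into the next call as the prior:

summarize_beta_binomial(14, 1, y = 1, n = 3)  # 1972 first: posterior Beta(15,3)
summarize_beta_binomial(15, 3, y = 0, n = 5)  # then 1971:  posterior Beta(15,8)
summarize_beta_binomial(15, 8, y = 1, n = 1)  # then 1970:  posterior Beta(16,8)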


What if we observed all the data at once?

Time                   Data           Model
before the analysis    NA             Beta(14,1)
1970                   Y = 1, n = 1
1971                   Y = 0, n = 5
1972                   Y = 1, n = 3
Total                  Y = 2, n = 9


summarize_beta_binomial(14, 1, y = 2, n = 9)
      model alpha beta      mean      mode         var         sd
1     prior    14    1 0.9333333 1.0000000 0.003888889 0.06236096
2 posterior    16    8 0.6666667 0.6818182 0.008888889 0.09428090

Let \(\theta\) be any parameter of interest with prior pdf \(f(\theta)\). Then a sequential analysis in which we first observe a data point \(y_1\) and then a second data point \(y_2\) will produce the same posterior model of \(\theta\) as if we first observe \(y_2\) and then \(y_1\):

\[f(\theta | y_1,y_2) = f(\theta|y_2,y_1)\;.\]

Similarly, the posterior model is invariant to whether we observe the data all at once or sequentially.
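
Both facts follow from applying Bayes’ rule one observation at a time. Assuming \(y_1\) and \(y_2\) are conditionally independent given \(\theta\),

\[f(\theta | y_1, y_2) \propto f(\theta | y_1)\, f(y_2 | \theta) \propto f(\theta)\, f(y_1 | \theta)\, f(y_2 | \theta)\;,\]

which is symmetric in \(y_1\) and \(y_2\) and is proportional to the posterior obtained by observing \((y_1, y_2)\) all at once.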