The beta-binomial model is a foundational model for Bayesian analysis. In this model, we assume the outcome follows a binomial distribution (note that a Bernoulli distribution is just the special case of a binomial distribution with n = 1) and that the prior distribution of \(\pi\), the probability of an event occurring, follows a beta distribution.
This type of model is useful for estimating, for example, the probability that a candidate will win an election, the proportion of people who drink beer, etc.
2.1 Beta Distribution
The beta distribution is parameterized by \(\alpha, \beta > 0\). If a random variable, X, is beta-distributed, we can notate it like so:
\(X \sim Beta(\alpha, \beta)\)
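The beta distribution is approximately Normal when \(\alpha \approx \beta\) and \(\alpha, \beta \gg 1\).
Because it is supported on the interval [0, 1] and can be uniform (\(\alpha = \beta = 1\)), skewed, U-shaped, or bell-shaped depending on its parameters, the beta distribution is a flexible choice of prior for a probability.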
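To get a feel for the Normal approximation, here is a minimal sketch that compares a beta density to a Normal density with the same mean and standard deviation. The parameter values (45 and 55), the evaluation grid, and the name max_gap are illustrative assumptions, not part of the model definition.

```julia
using Distributions

α, β = 45, 55                   # illustrative values: similar in size and well above 1
b = Beta(α, β)
nrm = Normal(mean(b), std(b))   # Normal with the same mean and standard deviation

# Evaluate both densities on a grid over (0, 1) and find the largest gap
xs = 0.01:0.01:0.99
max_gap = maximum(abs.(pdf.(b, xs) .- pdf.(nrm, xs)))
println("largest pdf gap on the grid: ", max_gap)
```

The gap should be small relative to the height of the density's peak; rerunning with small or very unequal parameters (e.g. 2 and 8) should show the approximation getting worse.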
2.2 Binomial Distribution
The binomial distribution is used when Y is a count outcome (e.g. the number of wins in a set of matches). Proportion outcomes are just rescaled count outcomes, so this distribution applies to proportions as well.
\(Y|\pi \sim Bin(n, \pi)\)
where \(\pi\) is the probability of success in a given trial.
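For reference, the probability of observing exactly y successes under this model is given by the standard binomial probability mass function:
\(P(Y = y \mid \pi) = \binom{n}{y} \pi^{y} (1 - \pi)^{n - y}\)
For example, with n = 100 and \(\pi\) = 0.5, the probability of exactly 50 successes is \(\binom{100}{50} 0.5^{100} \approx 0.0796\).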
We can plot this distribution for a specific choice of n and \(\pi\):
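using Random, Distributions, StatsPlots # StatsPlots re-exports Plots and provides density()
b = Binomial(100, 0.5) # 100 trials with π = .5
x_bin = 0:1:100
y_bin = pdf.(b, x_bin) # probability of each possible count
plot(x_bin, y_bin, label="Bin(100, .5)")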
2.3 The Beta-Binomial Model
The components above are sufficient to describe our Beta-Binomial model:
\(Y|\pi \sim Bin(n, \pi)\)
\(\pi \sim Beta(\alpha, \beta)\)
Let’s say we want to estimate the proportion of people who support Michelle in an election (a rough proxy for her chances of winning). We can simulate some data by sampling from the beta and binomial distributions.
Let’s start by setting up a prior for our values of \(\pi\). If we assume that our Beta distribution is parameterized as Beta(45, 55), we can simulate 1,000 values of \(\pi\) from this distribution.
Random.seed!(0408)
α = 45
β = 55
d = Beta(α, β)
n = 1_000 # sample 1k values
pi_sim = rand(d, n)
# plot the distribution of pi values
density(pi_sim)
Then, for each of these 1,000 values of \(\pi\) we’ve simulated, let’s assume we poll 100 people, and that the number of people who support Michelle follows a Binomial(100, \(\pi\)) distribution. We’ll draw one sample from each Binomial distribution (i.e. one for each value of \(\pi\)), and then we can plot the distribution of the simulated outcomes. Because these outcomes are simulated from the prior alone, before any data are observed, this is the prior predictive distribution rather than a posterior.
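The filtering step further down refers to a vector called y_sim. A minimal sketch of how this step could be carried out is below, assuming the same seed and Beta(45, 55) prior as above (regenerated here so the block runs on its own). The summary printout is an illustrative addition; a histogram of y_sim would show the full distribution.

```julia
using Random, Statistics, Distributions

Random.seed!(0408)
pi_sim = rand(Beta(45, 55), 1_000)   # prior draws of π, as in the previous step

# One Binomial(100, π) draw for each simulated value of π
y_sim = [rand(Binomial(100, p)) for p in pi_sim]

# Quick numeric summary of the simulated poll results
println("mean supporters out of 100: ", mean(y_sim))
println("range: ", extrema(y_sim))
```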
This will show us, roughly, how many people (out of 100) we can expect to support Michelle.
As a further step, let’s assume our data suggested that the “true” value of y is 50, i.e. that we conducted a poll and 50 (out of 100) people said they’d vote for Michelle. We can then look at the distribution of the simulated \(\pi\) values that produced exactly this outcome, which approximates the posterior distribution of \(\pi\) given the observed data.
inds = findall(x -> x == 50, y_sim) # indices of simulated polls with exactly 50 supporters
pi_50 = pi_sim[inds]
density(pi_50)
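Only a small share of the 1,000 simulated draws produce exactly y = 50, so this density is estimated from few values and is fairly noisy. It would be smoother with a larger sample, but we get the point.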
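To see what a larger sample buys us, here is a minimal sketch that repeats the simulation with more draws and compares the filtered \(\pi\) values to the exact answer. It relies on the standard conjugacy result for this model, \(\pi | y \sim Beta(\alpha + y, \beta + n - y)\), which gives Beta(95, 105) here. The sample size of 200,000 and the names n_sim, pi_big, y_big, and pi_50_big are assumptions made for illustration.

```julia
using Random, Statistics, Distributions

Random.seed!(0408)
n_sim = 200_000                        # assumed; much larger than the 1,000 used above
pi_big = rand(Beta(45, 55), n_sim)     # prior draws of π
y_big = [rand(Binomial(100, p)) for p in pi_big]

pi_50_big = pi_big[y_big .== 50]       # keep draws whose simulated poll gave exactly 50

exact = Beta(45 + 50, 55 + 100 - 50)   # conjugate posterior, Beta(95, 105)
println("draws kept:    ", length(pi_50_big))
println("filtered mean: ", mean(pi_50_big))
println("exact mean:    ", mean(exact))   # 95 / 200 = 0.475
```

With this many draws, the filtered mean should sit close to the exact posterior mean, and a density plot of pi_50_big should look much smoother than the one above.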