Intuitively explain the Beta Distribution and its applications.
Although I learned the beta distribution in UIC’s Bayesian methods course and tutored it, for example setting it up as the prior in a conjugate-distribution context, it was easy to forget because the material is dry and abstract. Here I try to combine the rigorous theory (UC coursework’s content) with intuition. That way, I was able to ‘permanently stamp’ the concept into my brain.
The Beta distribution is a probability distribution on/of probabilities
The beta distribution describes a family of continuous probability distributions that are nonzero only on the interval (0, 1).
For example, we can use it to model probabilities such as the click-through rate of an advertisement, batting averages, the 5-year survival chance for women with breast cancer, and so on.
A continuous random variable \(X_B \sim Beta(\alpha, \beta)\) has a Beta distribution if its probability density function (PDF) is
\[ f_{X_B} (x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \ \ \text{for} \ 0 < x < 1, \]
where \(B(\cdot)\) is the Beta function and the shape parameters satisfy \(\alpha, \beta > 0\).
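To make the formula concrete, here is a minimal sketch that codes the PDF by hand and compares it with R's built-in `dbeta()`. The helper name `beta_pdf` and the parameter values (2 and 5) are my own choices for illustration.

```r
# Hand-written Beta PDF following the formula above (valid for 0 < x < 1).
# `beta_par` is a hypothetical argument name, chosen to avoid clashing with beta().
beta_pdf <- function(x, alpha, beta_par) {
  x^(alpha - 1) * (1 - x)^(beta_par - 1) / beta(alpha, beta_par)
}

x <- c(0.1, 0.5, 0.9)
manual  <- beta_pdf(x, alpha = 2, beta_par = 5)
builtin <- dbeta(x, shape1 = 2, shape2 = 5)
print(all.equal(manual, builtin))

# A density must integrate to 1 over (0, 1); check this numerically.
total_area <- integrate(dbeta, lower = 0, upper = 1, shape1 = 2, shape2 = 5)$value
print(total_area)
```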
Distribution | Function | Probability as a … |
---|---|---|
Binomial | \(f(x) = {n \choose x} p^x (1-p)^{n-x}\) \(\rightarrow\) a function of \(x\) | parameter |
Beta | \(f(p) = \frac{1}{B(\alpha, \beta)} p^{\alpha - 1} (1 - p)^{\beta - 1}\) \(\rightarrow\) a function of \(p\) | random variable |
The intuition behind the beta distribution emerges when we look at the numerator of its PDF through the lens of the binomial distribution: in both, a quantity (written \(x\) in the PDF above and \(p\) in the table) raised to some power is multiplied by one minus that quantity raised to another power.
The difference between the binomial and the beta is that the former models the number of successes (\(x\)), while the latter models the probability (\(p\)) of success. In other words, the probability is a parameter in the binomial; in the Beta, the probability is a random variable.
In this context, we can interpret the shape parameters as counts: \(\alpha-1\) is the number of successes and \(\beta-1\) is the number of failures.
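The link between the two distributions can be checked numerically. The sketch below assumes 7 successes in 10 trials (numbers I made up) and compares the binomial likelihood, viewed as a function of \(p\), with the \(Beta(x+1, n-x+1)\) density. The two curves should differ only by the constant factor \(n+1\).

```r
n <- 10   # number of trials (illustrative)
x <- 7    # number of successes (illustrative)
p <- seq(0.05, 0.95, by = 0.05)

# Binomial probability of x successes, read as a function of p
lik <- dbinom(x, size = n, prob = p)

# Beta density with alpha - 1 = successes and beta - 1 = failures
dens <- dbeta(p, shape1 = x + 1, shape2 = n - x + 1)

# Same shape in p: the ratio is the constant n + 1 at every grid point
ratio <- dens / lik
print(range(ratio))
```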
We can explore the beauty of the beta distribution with the Beta distribution calculator built by Dr. Bognar at the University of Iowa.
The beta distribution is very flexible: it can be bell-shaped (the PDF is approximately normal if \(\alpha + \beta\) is large enough and \(\alpha\) and \(\beta\) are approximately equal), U-shaped (when \(\alpha < 1\) and \(\beta < 1\)), or even a straight line (such as the flat line of \(Beta(1,1)\)). Here’s a graph excerpt from Wikipedia.
The beta function is
\[ B(x,y) = \int_0^1 t^{x−1} (1−t)^{y−1} dt = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x+y)}, \]
where \(\Gamma(\cdot)\) is the Gamma function.
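As a quick sanity check, the three ways of getting the Beta function (the integral, the Gamma ratio, and R's built-in `beta()`) should agree. This is only a sketch with arbitrarily chosen arguments `a` and `b`.

```r
a <- 2.5   # arbitrary positive arguments, for illustration only
b <- 4

# Definition as an integral over (0, 1)
via_integral <- integrate(function(t) t^(a - 1) * (1 - t)^(b - 1),
                          lower = 0, upper = 1)$value

# Ratio of Gamma functions
via_gamma <- gamma(a) * gamma(b) / gamma(a + b)

# Built-in Beta function
via_builtin <- beta(a, b)

print(c(integral = via_integral, gamma = via_gamma, builtin = via_builtin))
```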
The Gamma function \(\Gamma\) is an extension of the factorial function, with its argument shifted down by 1, to real and complex numbers.
For positive integer \(n\):
\[ \Gamma (n) = (n−1)! = 1 \times 2 \times 3 \times ... \times (n−1) \]
For complex numbers with positive real part, the gamma function is defined by the integral below; analytic continuation extends it to all complex numbers except the non-positive integers:
\[ \Gamma (t) = \int_0^{\infty} x^{t-1} e^{-x} dx \]
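Both properties are easy to verify in R. The sketch below compares `gamma(n)` with explicitly computed factorials and evaluates the integral numerically at one non-integer point; the helper name `gamma_integral` and the test point 4.5 are my own choices.

```r
# Gamma(n) = (n - 1)! for positive integers n
n <- 1:6
factorials <- sapply(n, function(k) prod(seq_len(k - 1)))  # 0!, 1!, ..., 5!
print(rbind(gamma = gamma(n), factorial = factorials))

# Integral definition, evaluated numerically for a real t > 0
gamma_integral <- function(t) {
  integrate(function(x) x^(t - 1) * exp(-x), lower = 0, upper = Inf)$value
}
print(c(integral = gamma_integral(4.5), builtin = gamma(4.5)))
```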
Substituting the Gamma form into the Beta function lets us write the PDF of the Beta distribution in terms of the Gamma function. The Beta function is the product of the Gamma functions of each parameter divided by the Gamma function of the sum of the parameters (for the proof, see the further reading).
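Written out, replacing \(B(\alpha, \beta)\) in the density by its Gamma form gives the expression below. This is only a restatement of the two formulas already given, not a new result.

\[ f_{X_B} (x; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \ \ \text{for} \ 0 < x < 1. \]

When \(\alpha\) and \(\beta\) are positive integers, the constant in front becomes \(\frac{(\alpha + \beta - 1)!}{(\alpha - 1)! \, (\beta - 1)!}\).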
The mean and variance of the Beta distribution are
\[ E[X_B] = \mu = \frac{\alpha}{\alpha + \beta}; \ \ V[X_B] = \sigma^2 = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} \]
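The mean and variance formulas can be checked against numerical integration of the density. This is a rough sketch using \(Beta(3, 7)\), a pair of shape parameters I picked for illustration.

```r
alpha <- 3      # illustrative shape parameters
beta_par <- 7

# Closed-form mean and variance
mean_formula <- alpha / (alpha + beta_par)
var_formula  <- alpha * beta_par /
  ((alpha + beta_par)^2 * (alpha + beta_par + 1))

# First and second moments by numerical integration of x * f(x) and x^2 * f(x)
mean_numeric  <- integrate(function(x) x * dbeta(x, alpha, beta_par), 0, 1)$value
second_moment <- integrate(function(x) x^2 * dbeta(x, alpha, beta_par), 0, 1)$value
var_numeric   <- second_moment - mean_numeric^2

print(c(mean_formula = mean_formula, mean_numeric = mean_numeric))
print(c(var_formula = var_formula, var_numeric = var_numeric))
```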
The standard uniform distribution \(\text{Unif} \ (0,1)\) is a special case of the beta distribution \(Beta \ (1,1)\), when \(\alpha = \beta = 1\).
The mode is \(\omega = \frac{\alpha − 1}{\alpha + \beta − 2}\) for \(\alpha, \beta > 1\).
The concentration is \(\kappa = \alpha + \beta\).
Definitions of \(\mu, \omega\) and \(\kappa\) can be inverted:
\[ \alpha = \mu\kappa, \beta = (1 − \mu)\kappa \]
\[ \alpha = \omega(\kappa−2)+1, \beta = (1 − \omega)(\kappa−2)+1, \ \kappa > 2. \]
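The two inversions translate directly into small helper functions. The names `shape_from_mean` and `shape_from_mode` are hypothetical, and the inputs are just examples: a concentration of 8 around a mean of 0.5, and a concentration of 12 around a mode of 0.8.

```r
# Shape parameters from mean mu and concentration kappa
shape_from_mean <- function(mu, kappa) {
  c(alpha = mu * kappa, beta = (1 - mu) * kappa)
}

# Shape parameters from mode omega and concentration kappa (requires kappa > 2)
shape_from_mode <- function(omega, kappa) {
  stopifnot(kappa > 2)
  c(alpha = omega * (kappa - 2) + 1, beta = (1 - omega) * (kappa - 2) + 1)
}

ex_mean <- shape_from_mean(mu = 0.5, kappa = 8)      # alpha = 4, beta = 4
ex_mode <- shape_from_mode(omega = 0.8, kappa = 12)  # alpha = 9, beta = 3
print(ex_mean)
print(ex_mode)

# Round trip: recover the mode from the shape parameters
mode_back <- unname((ex_mode["alpha"] - 1) / (sum(ex_mode) - 2))
print(mode_back)
```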
The parameter \(\kappa\) measures how many observations are needed to change our prior belief about \(\mu\).
If \(\kappa\) is small, only a few new observations are needed.
Example. Concentration \(\kappa = 8\) around \(\mu = 0.5\) corresponds to \(\alpha = \mu \kappa = 4\) and \(\beta = (1 − \mu) \kappa = 4\).
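To see why \(\kappa\) behaves like a number of observations, here is a sketch that relies on the standard Beta-binomial conjugate update (my addition, not derived in this post): a \(Beta(\alpha, \beta)\) prior combined with \(s\) successes and \(f\) failures gives a \(Beta(\alpha + s, \beta + f)\) posterior. The function name `posterior_mean` and the data (8 successes, 2 failures) are made up for illustration.

```r
# Posterior mean after a conjugate update of a Beta prior given by (mu, kappa)
posterior_mean <- function(mu, kappa, successes, failures) {
  alpha_post <- mu * kappa + successes
  beta_post  <- (1 - mu) * kappa + failures
  alpha_post / (alpha_post + beta_post)
}

# Same prior mean 0.5, same data, different concentration
weak   <- posterior_mean(mu = 0.5, kappa = 8,  successes = 8, failures = 2)
strong <- posterior_mean(mu = 0.5, kappa = 80, successes = 8, failures = 2)

print(c(kappa_8 = weak, kappa_80 = strong))
```

With \(\kappa = 8\) the posterior mean moves to \(12/18 \approx 0.667\), while with \(\kappa = 80\) it only moves to \(48/90 \approx 0.533\): the more concentrated prior needs more data to shift.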
The parameterization in terms of the mean and standard deviation is:
\[ \alpha = \mu \left[ \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right]; \ \ \beta = (1 - \mu) \left[ \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right] \]
The standard deviation is typically smaller than that of the uniform distribution on \([0,1]\), i.e. \(1/\sqrt{12} \approx 0.28867\).
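The mean and standard deviation parameterization can be sketched the same way. The helper name `shape_from_mean_sd` is hypothetical, and the guard \(\sigma^2 < \mu(1-\mu)\) is my addition: without it the bracketed term is not positive and the shape parameters would not be valid.

```r
# Shape parameters from mean mu and standard deviation sigma
shape_from_mean_sd <- function(mu, sigma) {
  stopifnot(sigma^2 < mu * (1 - mu))       # needed for alpha, beta > 0
  common <- mu * (1 - mu) / sigma^2 - 1    # the bracketed term in the formula
  c(alpha = mu * common, beta = (1 - mu) * common)
}

s <- shape_from_mean_sd(mu = 0.3, sigma = 0.1)   # alpha = 6, beta = 14
print(s)

# Round trip: recover the standard deviation from the variance formula
sd_back <- unname(sqrt(s["alpha"] * s["beta"] / (sum(s)^2 * (sum(s) + 1))))
print(sd_back)

# Standard deviation of the uniform distribution on [0, 1], for reference
print(sqrt(1 / 12))
```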
Examples.
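library(ggplot2)

# Grid on [0, 1]; stat_function() evaluates each density over the range of p
p <- seq(0,1,by=0.2)
df <- data.frame(p)
# Vary alpha with beta fixed at 2
ggplot(data=df, aes(x=p))+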
stat_function(fun=dbeta, args=list(shape1=1, shape2=2), aes(colour = "alpha=1,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=2), aes(colour = "alpha=2,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=4, shape2=2), aes(colour = "alpha=4,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=6, shape2=2), aes(colour = "alpha=6,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=8, shape2=2), aes(colour = "alpha=8,beta=2")) +
scale_y_continuous(limits=c(0,3.6)) +
scale_colour_manual("", values = c("palegreen", "orange", "olivedrab", "blue", "black")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
# Vary beta with alpha fixed at 2
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=2, shape2=1), aes(colour = "alpha=2,beta=1")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=2), aes(colour = "alpha=2,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=5), aes(colour = "alpha=2,beta=5")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=6), aes(colour = "alpha=2,beta=6")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=8), aes(colour = "alpha=2,beta=8")) +
scale_y_continuous(limits=c(0,3.6)) +
scale_colour_manual("", values = c("palegreen", "orange", "olivedrab", "blue", "black")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
# Beta(1, 1) is the standard uniform distribution
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=1, shape2=1), aes(colour = "alpha=1,beta=1")) +
scale_y_continuous(limits=c(0,3.6)) +
scale_colour_manual("", values = c("green")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
# Symmetric cases (alpha = beta): U-shaped, flat, then increasingly bell-shaped
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=0.5, shape2=0.5), aes(colour = "alpha=0.5,beta=0.5")) +
stat_function(fun=dbeta, args=list(shape1=1, shape2=1), aes(colour = "alpha=1,beta=1")) +
stat_function(fun=dbeta, args=list(shape1=2, shape2=2), aes(colour = "alpha=2,beta=2")) +
stat_function(fun=dbeta, args=list(shape1=4, shape2=4), aes(colour = "alpha=4,beta=4")) +
stat_function(fun=dbeta, args=list(shape1=6, shape2=6), aes(colour = "alpha=6,beta=6")) +
scale_y_continuous(limits=c(0,3.6)) +
scale_colour_manual("", values = c("palegreen", "orange", "olivedrab", "blue", "black")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
# Same mean, different concentration: a larger alpha + beta gives a narrower peak
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=400, shape2=80), aes(colour = "alpha=400,beta=80")) +
stat_function(fun=dbeta, args=list(shape1=40, shape2=8), aes(colour = "alpha=40,beta=8")) +
stat_function(fun=dbeta, args=list(shape1=30, shape2=70), aes(colour = "alpha=30,beta=70")) +
stat_function(fun=dbeta, args=list(shape1=3, shape2=7), aes(colour = "alpha=3,beta=7")) +
scale_y_continuous(limits=c(0,25)) +
scale_colour_manual("", values = c("blue", "green", "orange", "black")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
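# Beta densities compared with the Bernoulli(0.5) probability mass function
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=1, shape2=1), aes(colour = "alpha=1,beta=1")) +
stat_function(fun=dbeta, args=list(shape1=3, shape2=3), aes(colour = "alpha=3,beta=3")) +
# Bernoulli has mass only at 0 and 1, so draw points there instead of evaluating dbinom at non-integers
geom_point(data=data.frame(p=c(0,1), mass=dbinom(c(0,1), size=1, prob=0.5)), aes(y=mass, colour = "Bernoulli w/ prob=0.5")) +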
scale_y_continuous(limits=c(0,3.6)) +
scale_colour_manual("", values = c("red","green","black")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
# Beta(9, 3): mode at (9 - 1) / (9 + 3 - 2) = 0.8
ggplot(data=df, aes(x=p))+
stat_function(fun=dbeta, args=list(shape1=9, shape2=3), aes(colour = "alpha=9,beta=3")) +
scale_y_continuous(limits=c(0,3.4)) +
scale_colour_manual("", values = c("blue")) +
ylab("Density") +
ggtitle("PDF of Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))
From the plots above we notice that:
shiny
I am planning to build a shiny app that plots the beta distribution for user-specified shape parameters (still in progress).
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hai-mn/hai-mn.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nguyen (2021, April 11). HaiBiostat: Beta Distribution: an Intuitive Explanation. Retrieved from https://hai-mn.github.io/posts/2021-04-11-beta-distribution-in-intuitive-explanation/
BibTeX citation
@misc{nguyen2021beta, author = {Nguyen, Hai}, title = {HaiBiostat: Beta Distribution: an Intuitive Explanation}, url = {https://hai-mn.github.io/posts/2021-04-11-beta-distribution-in-intuitive-explanation/}, year = {2021} }