Intuitively explain the Gamma Distribution and its applications.
Why did we invent Gamma distribution? Answer: To predict the wait time until future events. Hmmm ok, but I thought that’s what the exponential distribution is for. Then, what’s the difference between exponential distribution and gamma distribution? The exponential distribution predicts the wait time until the very first event. The gamma distribution, on the other hand, predicts the wait time until the k-th event occurs.
– Aerin Kim, Gamma Distribution – Intuition, Derivation, and Examples
Examples:
In Lecture 24 of Statistics 110 (Gamma distribution and Poisson process), Prof. Joe Blitzstein connects the factorial \(n!\) to the Gamma function. Why?
Let’s look at the Gamma function.
The Gamma function \(\Gamma\) is an extension of the factorial function, with its argument shifted down by 1, to real and complex numbers.
For positive integer \(n\): \[\Gamma(n) = (n-1)! = 1 \times 2 \times 3 \times ... \times (n-1)\]
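The identity above can be checked quickly in R. This is a minimal sketch using only the base functions `gamma()` and `factorial()`; the variable names are illustrative.

```r
# Check Gamma(n) = (n - 1)! for the first few positive integers
n <- 1:6
gamma_vals <- gamma(n)             # Gamma function from base R
fact_vals  <- c(1, cumprod(1:5))   # 0!, 1!, 2!, ..., 5! built by hand

print(data.frame(n, gamma_vals, fact_vals))
print(all.equal(gamma_vals, fact_vals))

# Unlike the factorial, Gamma is also defined between the integers,
# for example Gamma(1/2) = sqrt(pi)
print(c(gamma(0.5), sqrt(pi)))
```

The last line hints at why the Gamma function is useful: it takes values at non-integer arguments too.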
One beautiful formula, Stirling's formula, gives an extremely good approximation of \(n!\): \[ n! \approx \sqrt{2\pi n} \Big( \frac{n}{e}\Big)^n\]
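To get a feel for how good the approximation is, the sketch below compares it with the exact factorial. The helper name `stirling()` is made up for this illustration.

```r
# Stirling's approximation: n! ~ sqrt(2 * pi * n) * (n / e)^n
stirling <- function(n) sqrt(2 * pi * n) * (n / exp(1))^n

n <- c(1, 5, 10, 20)
exact       <- factorial(n)
approx_vals <- stirling(n)

# Relative error of the approximation; it shrinks as n grows
rel_error <- (exact - approx_vals) / exact
print(data.frame(n, exact, approx_vals, rel_error))
```

The approximation slightly underestimates \(n!\), and the relative error falls below 1% by \(n = 10\).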
library(ggplot2)

# Compute n! for n = 1..6 iteratively: y[i] = y[i - 1] * i
n <- 1:6
y <- numeric(length(n))
y[1] <- 1
for (i in 2:length(n)) {
  y[i] <- y[i - 1] * i
}
dta <- data.frame(n, y)

# Plot the factorial values as isolated points
ggplot(dta, aes(n, y)) +
  geom_point() +
  scale_x_continuous(breaks = n) +
  theme_bw()
How, then, do we connect the dots? There are many ways to do it, but a particularly natural one is the Gamma function. It is defined by the integral below for complex numbers with positive real part, and it extends by analytic continuation to all complex numbers except the non-positive integers:
\[\Gamma(t) = \int_0^{\infty} x^t e^{−x} \frac{dx}{x} \]
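As a rough numerical check, the integral can be evaluated with base R's `integrate()` and compared with the built-in `gamma()`. This is only a sketch: the helper `gamma_integral()` and the chosen values of \(t\) are illustrative, and \(x^t / x\) is written as \(x^{t-1}\).

```r
# Evaluate Gamma(t) = integral from 0 to Inf of x^(t - 1) * exp(-x) dx numerically
gamma_integral <- function(t) {
  integrand <- function(x) x^(t - 1) * exp(-x)
  integrate(integrand, lower = 0, upper = Inf)$value
}

t_vals      <- c(1.5, 2.5, 4, 5)
by_integral <- sapply(t_vals, gamma_integral)
builtin     <- gamma(t_vals)

print(data.frame(t = t_vals, by_integral, builtin))
```

The two columns should agree up to the numerical tolerance of `integrate()`, including at the non-integer values of \(t\).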
How do we get the PDF of the Gamma distribution from the Gamma function? We normalize the integral so that it equals 1, going from:
\[ \Gamma(k) = \int_0^{\infty} x^{k} e^{−x} \frac{dx}{x} \]
to,
\[ 1 = \int_0^{\infty} \frac{1}{\Gamma(k)} x^{k} e^{−x} \frac{dx}{x} \]
Then the integrand, \(f_X(x) = \frac{1}{\Gamma(k)} x^{k} e^{-x} \frac{1}{x}\), is the PDF of \(X \sim Gamma(k, 1)\), which has shape \(k\) and scale \(1\).
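The normalization step can be sketched in R: the normalized integrand should integrate to 1 and match `dgamma()` with `shape = k` and `rate = 1`. The value \(k = 3\) and the name `f_X` are arbitrary choices for illustration.

```r
k <- 3  # illustrative shape value

# Normalized integrand: x^(k - 1) * exp(-x) / Gamma(k)
f_X <- function(x) x^(k - 1) * exp(-x) / gamma(k)

# The total area under the curve should be (numerically) 1
total_area <- integrate(f_X, lower = 0, upper = Inf)$value
print(total_area)

# The same function is available in base R as dgamma(shape = k, rate = 1)
x <- c(0.5, 1, 2, 5)
print(all.equal(f_X(x), dgamma(x, shape = k, rate = 1)))
```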
How do we turn the scale of \(1\) into a general scale of \(\theta\)?
Let \(Y = \theta X\), where \(X \sim Gamma(k,1)\), so that \(x = \frac{y}{\theta}\).
\(f_Y(y) = f_X(x) \frac{dx}{dy} = \frac{1}{\Gamma(k)} \Big(\frac{y}{\theta}\Big)^{k} e^{-y/\theta} \frac{\theta}{y} \frac{1}{\theta}\) where \(\frac{dx}{dy} = \frac{1}{\theta}\)
Thus, \(f_Y(y) = \frac{1}{\Gamma(k) \theta^{k}} y^{k} e^{-y/\theta} \frac{1}{y} = \frac{1}{\Gamma(k) \theta^{k}} y^{k-1} e^{-y/\theta}\)
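As a sanity check on the change of variables, the scale-\(\theta\) density \(\frac{1}{\Gamma(k)\theta^{k}} y^{k-1} e^{-y/\theta}\) can be coded by hand and compared with `dgamma()`. This is a sketch under illustrative values (\(k = 10\), \(\theta = 2\)); the function name `gamma_pdf_scale()` is hypothetical.

```r
# Hand-written Gamma PDF in the (shape k, scale theta) parameterization
gamma_pdf_scale <- function(y, k, theta) {
  y^(k - 1) * exp(-y / theta) / (gamma(k) * theta^k)
}

k     <- 10
theta <- 2
y     <- seq(0.5, 40, by = 0.5)

by_hand  <- gamma_pdf_scale(y, k, theta)
by_scale <- dgamma(y, shape = k, scale = theta)
by_rate  <- dgamma(y, shape = k, rate = 1 / theta)  # rate is the reciprocal of scale

print(all.equal(by_hand, by_scale))
print(all.equal(by_scale, by_rate))
```

Both comparisons should agree, which also shows that `scale = theta` and `rate = 1 / theta` describe the same distribution.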
knitr::include_graphics("Gamma_scalevsrate_inwiki.png")
For the (\(\alpha\), \(\beta\)) parameterization: in our notation, with \(k\) (the number of events) and \(\lambda\) (the rate of events), simply substitute \(\alpha\) with \(k\) and \(\beta\) with \(\lambda = 1/\theta\). The PDF keeps the same form as the one derived above.
For the (\(k\), \(\theta\)) parameterization: \(\theta\) is the reciprocal of the event rate \(\lambda\), that is, the mean wait time (the average time between event arrivals).
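The "wait time until the \(k\)-th event" reading can be illustrated with a small simulation. If the gaps between events are independent exponential wait times with rate \(\lambda\), the time until the \(k\)-th event is the sum of \(k\) such gaps and follows a gamma distribution with shape \(k\) and rate \(\lambda\). The values \(k = 5\), \(\lambda = 2\), the seed, and the number of simulations are arbitrary choices for this sketch.

```r
set.seed(110)  # arbitrary seed, for reproducibility only

k      <- 5       # wait for the 5th event
lambda <- 2       # events arrive at rate 2 per unit time
n_sim  <- 100000  # number of simulated waiting times

# Each row holds k exponential gaps; the row sum is the time of the k-th event
gaps   <- matrix(rexp(n_sim * k, rate = lambda), nrow = n_sim, ncol = k)
wait_k <- rowSums(gaps)

# Compare simulated moments with the gamma theory: mean k / lambda, variance k / lambda^2
print(c(simulated = mean(wait_k), theory = k / lambda))
print(c(simulated = var(wait_k),  theory = k / lambda^2))

# Compare an empirical probability with the gamma CDF
p_sim    <- mean(wait_k <= 2)
p_theory <- pgamma(2, shape = k, rate = lambda)
print(c(simulated = p_sim, theory = p_theory))
```

With \(\theta = 1/\lambda\), the theoretical mean \(k/\lambda\) is the same as \(k\theta\): \(k\) events, each taking \(\theta\) on average.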
Below, the gamma distribution is plotted for several values of the shape \(k\), with the rate held constant at \(1\).
# x-axis range for the curves (stat_function evaluates the density across it)
T <- seq(0, 20, by = 2.5)
df <- data.frame(T)
ggplot(data = df, aes(x = T)) +
  stat_function(fun = dgamma, args = list(shape = 1, rate = 1), aes(colour = "k= 1")) +
  stat_function(fun = dgamma, args = list(shape = 5, rate = 1), aes(colour = "k= 5")) +
  stat_function(fun = dgamma, args = list(shape = 10, rate = 1), aes(colour = "k=10")) +
  # Zoom the y-axis without dropping the part of the k = 1 curve above 0.40
  coord_cartesian(ylim = c(0, 0.40)) +
  scale_colour_manual("", values = c("palegreen", "yellowgreen", "olivedrab")) +
  ylab("Probability Density") +
  ggtitle("PDF of Gamma Distribution") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5))
Next, the gamma distribution is plotted with the shape held constant at k = 10 and the rate varying from 1 to 3.
ggplot(data = df, aes(x = T)) +
  stat_function(fun = dgamma, args = list(shape = 10, rate = 1), aes(colour = "r=1")) +
  stat_function(fun = dgamma, args = list(shape = 10, rate = 2), aes(colour = "r=2")) +
  stat_function(fun = dgamma, args = list(shape = 10, rate = 3), aes(colour = "r=3")) +
  scale_y_continuous(limits = c(0, 0.40)) +
  scale_colour_manual("", values = c("gold", "burlywood", "darkorange")) +
  ylab("Probability Density") +
  ggtitle("PDF of Gamma Distribution (k=10)") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5))
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hai-mn/hai-mn.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nguyen (2021, April 11). HaiBiostat: Gamma Distribution: an Intuitive Explanation. Retrieved from https://hai-mn.github.io/posts/2021-04-11-gamma-distribution-in-intuitive-explanation/
BibTeX citation
@misc{nguyen2021gamma, author = {Nguyen, Hai}, title = {HaiBiostat: Gamma Distribution: an Intuitive Explanation}, url = {https://hai-mn.github.io/posts/2021-04-11-gamma-distribution-in-intuitive-explanation/}, year = {2021} }