Common Continuous Distributions

A Catalog of Distributions

Why study specific distribution families? Because real-world phenomena — from noise in communication channels to inter-arrival times in queueing networks — are well-modeled by a small number of canonical distributions. Each family arises from a natural structural assumption (e.g., memorylessness for the exponential, maximum entropy for the Gaussian). Knowing these families and their properties is the probabilist's toolkit.

Definition:

Uniform Distribution

A random variable X has the uniform distribution on [a, b], written X \sim \text{Uniform}[a, b], if its PDF is

f(x) = \frac{1}{b - a}\,\mathbf{1}_{\{x \in [a,b]\}}.

The CDF is F(x) = \frac{x - a}{b - a} for x \in [a, b]. Mean: (a+b)/2. Variance: (b-a)^2/12.

The uniform distribution is the simplest continuous distribution and plays a fundamental role in simulation: by the inverse transform method, any continuous distribution can be generated from \text{Uniform}[0,1] samples.
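As a concrete illustration, the inverse transform method can be sketched in a few lines of Python (NumPy assumed available; the exponential serves as the target distribution, with an illustrative rate):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # illustrative target: Exp(lam)

# Inverse transform: if U ~ Uniform[0,1] and F is a continuous CDF,
# then F^{-1}(U) has CDF F.  For Exp(lam), F(x) = 1 - exp(-lam*x),
# so F^{-1}(u) = -log(1 - u) / lam.
u = rng.uniform(size=100_000)
x = -np.log(1.0 - u) / lam

# Sample moments should be near 1/lam = 0.5 and 1/lam^2 = 0.25.
print(x.mean(), x.var())
```

The same recipe works for any distribution whose inverse CDF is available in closed form.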

Definition:

Exponential Distribution

A random variable X has the exponential distribution with rate \lambda > 0, written X \sim \text{Exp}(\lambda), if its PDF is

f(x) = \lambda e^{-\lambda x}\,\mathbf{1}_{\{x \geq 0\}}.

CDF: F(x) = 1 - e^{-\lambda x} for x \geq 0. Mean: 1/\lambda. Variance: 1/\lambda^2.

Theorem: Memoryless Property of the Exponential

A continuous random variable X taking values in [0, \infty) is memoryless, meaning

\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t) \quad \text{for all } s, t \geq 0,

if and only if X \sim \text{Exp}(\lambda) for some \lambda > 0.

Given that the event has not occurred in the first s time units, the residual waiting time has the same distribution as the original. The exponential distribution "forgets" how long it has already waited.
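The memoryless property is easy to confirm by simulation (a quick NumPy sketch with illustrative values of \lambda, s, t; this is a numerical check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, s, t = 1.0, 1.5, 2.0  # illustrative values

# numpy parametrizes the exponential by its mean (scale = 1/lam).
x = rng.exponential(scale=1.0 / lam, size=500_000)

# Empirical P(X > s+t | X > s) vs. empirical P(X > t):
cond = (x > s + t).sum() / (x > s).sum()
uncond = (x > t).mean()
print(cond, uncond)  # both near exp(-lam*t)
```

Both estimates agree with each other and with the closed-form survival probability e^{-\lambda t}.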

Exponential as the Continuous Geometric

The memoryless property of the exponential is the continuous analog of the memoryless property of the geometric distribution. Indeed, the exponential arises as the scaling limit of the geometric: if W \sim \text{Geometric}(\lambda\delta) and X_\delta = \delta W, then as \delta \to 0, X_\delta \to X \sim \text{Exp}(\lambda) in distribution. This connection is revisited in Chapter 11 (Poisson processes).
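Because both survival functions have closed forms, the scaling limit can be checked numerically without any simulation. A sketch with illustrative values, assuming the convention \mathbb{P}(W > k) = (1-p)^k for W \sim \text{Geometric}(p):

```python
import math

lam, delta, x = 2.0, 1e-4, 0.7  # illustrative values

# P(X_delta > x) = P(W > x/delta) = (1 - lam*delta)^floor(x/delta),
# which approaches exp(-lam*x) as delta -> 0.
surv_geom = (1.0 - lam * delta) ** math.floor(x / delta)
surv_exp = math.exp(-lam * x)
print(surv_geom, surv_exp)  # agree to about four decimal places
```

Shrinking delta further drives the two survival probabilities together, exactly as the limit statement predicts.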

Definition:

Gaussian (Normal) Distribution

A random variable X has the Gaussian (or normal) distribution with mean \mu and variance \sigma^2 > 0, written X \sim \mathcal{N}(\mu, \sigma^2), if its PDF is

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).

The standard normal distribution is \mathcal{N}(0, 1).

The Gaussian is the most important distribution in all of probability and engineering. Its privileged status stems from three independent facts: (1) it is the limit of sums of i.i.d. random variables (CLT), (2) it maximizes entropy under a second-moment constraint, and (3) it is closed under linear operations.

Definition:

Gamma Distribution

A random variable X has the Gamma distribution with shape \alpha > 0 and rate \lambda > 0, written X \sim \text{Gamma}(\alpha, \lambda), if its PDF is

f(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)}\,x^{\alpha - 1}e^{-\lambda x}\,\mathbf{1}_{\{x \geq 0\}},

where \Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1}e^{-t}\,dt.

Mean: \alpha/\lambda. Variance: \alpha/\lambda^2.

Special cases: \alpha = 1 yields \text{Exp}(\lambda). When \alpha = d/2 and \lambda = 1/2, we get the chi-squared distribution with d degrees of freedom. For integer \alpha, a \text{Gamma}(\alpha, \lambda) random variable is the sum of \alpha independent \text{Exp}(\lambda) variables.
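The sum representation for integer shape is easy to verify by simulation (a NumPy sketch with illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, lam, n = 5, 2.0, 200_000  # illustrative: integer shape

# Sum alpha i.i.d. Exp(lam) draws (numpy's scale is the mean 1/lam).
x = rng.exponential(scale=1.0 / lam, size=(n, alpha)).sum(axis=1)

# Gamma(alpha, lam) predicts mean alpha/lam = 2.5, var alpha/lam^2 = 1.25.
print(x.mean(), x.var())
```

The sample mean and variance land on the Gamma formulas, consistent with the sum-of-exponentials characterization.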

Definition:

Beta Distribution

A random variable X has the Beta distribution with parameters a, b > 0, written X \sim \text{Beta}(a, b), if its PDF is

f(x) = \frac{1}{B(a,b)}\,x^{a-1}(1-x)^{b-1}\,\mathbf{1}_{\{x \in [0,1]\}},

where B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b).

Mean: a/(a+b). Variance: ab/[(a+b)^2(a+b+1)].

The Beta distribution is the natural distribution on [0,1]. It is the conjugate prior for the Bernoulli likelihood in Bayesian inference. The special case a = b = 1 gives \text{Uniform}[0,1].
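The conjugacy is just bookkeeping on the exponents of the Beta PDF: multiplying the prior kernel x^{a-1}(1-x)^{b-1} by a Bernoulli likelihood x^k(1-x)^{n-k} yields another Beta kernel. A sketch of the update, with hypothetical data:

```python
# Beta-Bernoulli conjugacy: prior Beta(a, b) on the success probability
# plus k successes in n trials yields posterior Beta(a + k, b + n - k).
a, b = 1.0, 1.0    # Beta(1,1) = Uniform[0,1] prior
k, n = 7, 10       # hypothetical data: 7 successes in 10 trials

a_post, b_post = a + k, b + (n - k)
post_mean = a_post / (a_post + b_post)  # posterior mean a'/(a' + b')
print(a_post, b_post, post_mean)  # 8.0 4.0 0.666...
```

The posterior mean a'/(a'+b') shrinks the raw frequency k/n toward the prior mean a/(a+b).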

Definition:

Student-t Distribution

A random variable T has the Student-t distribution with \nu degrees of freedom, written T \sim t_\nu, if its PDF is

f(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}.

Mean: 0 (for \nu > 1). Variance: \nu/(\nu - 2) (for \nu > 2).

As \nu \to \infty, t_\nu \to \mathcal{N}(0,1). The Student-t has heavier tails than the Gaussian, making it important in robust estimation and small-sample inference. For \nu = 1, we recover the Cauchy distribution (no finite mean).
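The variance formula can be checked by sampling (a NumPy sketch; \nu = 5 is an illustrative choice large enough that the variance exists):

```python
import numpy as np

rng = np.random.default_rng(7)
nu, n = 5, 400_000  # illustrative: nu > 2 so the variance is finite

# Draw from the Student-t with nu degrees of freedom.
t = rng.standard_t(df=nu, size=n)
print(t.var())  # near nu/(nu - 2) = 5/3
```

Repeating with smaller \nu shows the sample variance growing, and for \nu \leq 2 it fails to stabilize at all, reflecting the infinite variance.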

Definition:

Rayleigh Distribution

If X, Y \sim \mathcal{N}(0, \sigma^2) are independent, then R = \sqrt{X^2 + Y^2} has the Rayleigh distribution with parameter \sigma:

f_R(r) = \frac{r}{\sigma^2}\exp\!\left(-\frac{r^2}{2\sigma^2}\right)\,\mathbf{1}_{\{r \geq 0\}}.

Mean: \sigma\sqrt{\pi/2}. Variance: (2 - \pi/2)\sigma^2.

The Rayleigh distribution models the envelope (amplitude) of a complex Gaussian signal. In wireless communications, when a transmitted signal reaches the receiver via many independent scattered paths with no line-of-sight component, the received envelope is Rayleigh-distributed.
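The envelope construction in the definition translates directly into a simulation (a NumPy sketch with \sigma = 1):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, n = 1.0, 400_000

# Envelope of a zero-mean complex Gaussian: R = sqrt(X^2 + Y^2).
x = rng.normal(0.0, sigma, n)
y = rng.normal(0.0, sigma, n)
r = np.hypot(x, y)

# Rayleigh predicts mean sigma*sqrt(pi/2) and variance (2 - pi/2)*sigma^2.
print(r.mean(), r.var())
```

The sample moments match the Rayleigh formulas, confirming that the envelope of independent Gaussian quadratures is Rayleigh.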

Definition:

Ricean Distribution

If X \sim \mathcal{N}(\mu_1, \sigma^2) and Y \sim \mathcal{N}(\mu_2, \sigma^2) are independent with s = \sqrt{\mu_1^2 + \mu_2^2} (the line-of-sight amplitude), then R = \sqrt{X^2 + Y^2} has the Ricean distribution with parameters s and \sigma:

f_R(r) = \frac{r}{\sigma^2}\exp\!\left(-\frac{r^2 + s^2}{2\sigma^2}\right)I_0\!\left(\frac{rs}{\sigma^2}\right)\,\mathbf{1}_{\{r \geq 0\}},

where I_0 is the zeroth-order modified Bessel function of the first kind. The K-factor K = s^2/(2\sigma^2) quantifies the ratio of direct to scattered power. When K = 0, the Ricean reduces to the Rayleigh.
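The second moment gives a quick consistency check: E[R^2] = E[X^2] + E[Y^2] = s^2 + 2\sigma^2, the total direct-plus-scattered power. A NumPy sketch with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(4)
mu1, mu2, sigma, n = 1.0, 1.0, 0.5, 400_000  # illustrative values

# Ricean envelope: nonzero-mean complex Gaussian (line-of-sight present).
r = np.hypot(rng.normal(mu1, sigma, n), rng.normal(mu2, sigma, n))

s2 = mu1**2 + mu2**2       # LOS power s^2
K = s2 / (2 * sigma**2)    # K-factor: direct-to-scattered power ratio
print(K, (r**2).mean())    # E[R^2] = s^2 + 2*sigma^2 = 2.5 here
```

Setting mu1 = mu2 = 0 (so K = 0) collapses this back to the Rayleigh case of the previous definition.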

Definition:

Nakagami-m Distribution

A random variable R has the Nakagami-m distribution with shape m \geq 1/2 and spread \Omega > 0 if

f_R(r) = \frac{2m^m}{\Gamma(m)\,\Omega^m}\,r^{2m-1}\exp\!\left(-\frac{m r^2}{\Omega}\right)\,\mathbf{1}_{\{r \geq 0\}}.

For m = 1, this is the Rayleigh distribution with \Omega = 2\sigma^2. The Nakagami-m is more flexible than the Rayleigh or Ricean because it can model a wider range of fading severities through the single parameter m.
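A convenient sampling route, and a check of the m = 1 reduction: a change of variables in the PDF above shows that R^2 \sim \text{Gamma}(m, m/\Omega) in the rate parametrization, so R = \sqrt{G} with G a Gamma draw is Nakagami-m. A NumPy sketch (numpy's gamma sampler takes the scale, i.e. the reciprocal rate \Omega/m):

```python
import numpy as np

rng = np.random.default_rng(5)
m, omega, n = 1.0, 2.0, 400_000  # illustrative values

# R = sqrt(G) with G ~ Gamma(shape=m, scale=omega/m) is Nakagami-m
# with spread omega = E[R^2].  For m = 1 and omega = 2*sigma^2 this
# is Rayleigh with sigma = 1, so E[R] should be sigma*sqrt(pi/2).
r = np.sqrt(rng.gamma(shape=m, scale=omega / m, size=n))

print((r**2).mean(), r.mean())  # near omega = 2 and sqrt(pi/2)
```

Varying m away from 1 then sweeps through milder (m > 1) and more severe (m < 1) fading while keeping the mean power \Omega fixed.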

Common Continuous Distributions at a Glance

| Distribution | PDF | Mean | Variance | Key Property |
|---|---|---|---|---|
| \text{Uniform}[a,b] | \frac{1}{b-a} | \frac{a+b}{2} | \frac{(b-a)^2}{12} | Maximum entropy on [a,b] |
| \text{Exp}(\lambda) | \lambda e^{-\lambda x} | 1/\lambda | 1/\lambda^2 | Memoryless |
| \mathcal{N}(\mu,\sigma^2) | \frac{1}{\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)} | \mu | \sigma^2 | Max entropy under 2nd moment |
| \text{Gamma}(\alpha,\lambda) | \frac{\lambda^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\lambda x} | \alpha/\lambda | \alpha/\lambda^2 | Sum of \alpha exponentials |
| \text{Beta}(a,b) | \frac{x^{a-1}(1-x)^{b-1}}{B(a,b)} | \frac{a}{a+b} | \frac{ab}{(a+b)^2(a+b+1)} | Conjugate prior for Bernoulli |
| \text{Rayleigh}(\sigma) | \frac{r}{\sigma^2}e^{-r^2/(2\sigma^2)} | \sigma\sqrt{\pi/2} | (2-\pi/2)\sigma^2 | Envelope of complex Gaussian |

Fading Distribution Comparison

Compare the Rayleigh, Ricean, and Nakagami-m distributions — the three canonical fading models in wireless communications. Adjust the K-factor and Nakagami m parameter to see how the distributions interpolate between severe and mild fading.


Exponential Memoryless Property

Visualize the memoryless property: the conditional survival function \mathbb{P}(X > s + t \mid X > s) always equals \mathbb{P}(X > t), regardless of how long we have already waited.


Historical Note: The Gaussian: From Errors to Everything

1733–1900

The Gaussian distribution was discovered independently by de Moivre (1733, as an approximation to the binomial), Laplace (1774, in the context of measurement errors), and Gauss (1809, in the theory of least squares for astronomical observations). The name "normal distribution" was popularized by Karl Pearson and Francis Galton in the late 19th century, though many mathematicians and statisticians (including Gauss himself) would have objected to the implication that other distributions are "abnormal."

Common Mistake: Rate vs. Mean Parametrization of the Exponential

Mistake:

Confusing \text{Exp}(\lambda) (rate parametrization, mean = 1/\lambda) with the mean parametrization \text{Exp}(\beta) (mean = \beta, rate = 1/\beta) used in some texts.

Correction:

In this book (following Caire), we always use the rate parametrization: f(x) = \lambda e^{-\lambda x}, mean = 1/\lambda. When reading other sources, check which convention is in use.
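This gotcha shows up immediately in standard numerical libraries: NumPy's exponential sampler, for instance, takes the mean ("scale") parametrization, so the rate must be inverted at the call site. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
lam = 4.0  # rate, as in this book: the mean should be 1/lam = 0.25

# numpy takes scale = 1/lam (the MEAN), not the rate itself;
# passing scale=lam would silently give a mean of 4.0 instead.
x = rng.exponential(scale=1.0 / lam, size=200_000)
print(x.mean())  # near 0.25
```

A mean off by a factor of \lambda^2 is the telltale symptom of mixing the two conventions.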

Quick Check

Which distribution is the special case \text{Gamma}(1, \lambda)?

\text{Uniform}[0, 1/\lambda]

\text{Exp}(\lambda)

\mathcal{N}(0, 1/\lambda)

\text{Beta}(1, \lambda)

Why This Matters: Fading Distributions

In wireless communications, the received signal amplitude depends on the propagation environment. When many scatterers contribute with no dominant path, the central limit theorem yields a complex Gaussian signal, whose envelope is Rayleigh-distributed. A strong line-of-sight component shifts the model to Ricean. The Nakagami-m family provides additional flexibility to fit empirical data. Every bit error rate and outage probability expression in fading channels (Book 1, Chapters 6 and 10) involves the PDF or CDF of one of these distributions.

Memoryless Property

A distribution is memoryless if \mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t). The only continuous memoryless distribution is the exponential. The only discrete memoryless distribution is the geometric.

Related: Exponential Distribution

Exponential Distribution

X \sim \text{Exp}(\lambda): PDF \lambda e^{-\lambda x} for x \geq 0. The unique continuous memoryless distribution. Models inter-arrival times in Poisson processes.

Related: Memoryless Property