The Moment Generating Function
Why Transform Methods?
We have spent several chapters learning to compute with distributions directly: PDFs, CDFs, convolutions. This works well for one or two random variables, but the moment you need the distribution of a sum of $n$ independent random variables, you face an $n$-fold convolution that becomes unwieldy even for small $n$.
Transform methods offer an elegant alternative: encode the distribution into a single function, and the convolution becomes a product. This chapter develops three transforms (the MGF, the characteristic function, and the PGF), each suited to different contexts. Together, they form the analytical backbone that powers the great limit theorems: the law of large numbers and the central limit theorem.
Definition: Moment Generating Function (MGF)
Moment Generating Function (MGF)
Let $X$ be a random variable with CDF $F_X$. The moment generating function (MGF) of $X$ is
$$M_X(t) = \mathbb{E}\!\left[e^{tX}\right] = \int_{-\infty}^{\infty} e^{tx}\, dF_X(x).$$
The MGF may be finite only on a subset of $\mathbb{R}$. We say that the MGF exists if there is an open interval $(-h, h)$ with $h > 0$ such that $M_X(t) < \infty$ for all $t \in (-h, h)$.
The Riemann-Stieltjes formulation unifies the discrete case ($M_X(t) = \sum_x e^{tx}\, p_X(x)$) and the continuous case ($M_X(t) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx$).
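The two cases can be checked numerically. The sketch below (the distributions, grid size, and tolerances are illustrative choices, not from the text) evaluates the discrete sum for a Bernoulli variable and a midpoint Riemann sum for a Uniform$(0,1)$ density, and compares both against their closed-form MGFs:

```python
import math

def mgf_discrete(pmf, t):
    # Discrete case: M_X(t) = sum over x of e^{tx} * p_X(x)
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def mgf_continuous(pdf, t, lo, hi, n=100_000):
    # Continuous case: M_X(t) = integral of e^{tx} f_X(x) dx,
    # approximated by a midpoint Riemann sum on [lo, hi]
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += math.exp(t * x) * pdf(x)
    return total * dx

t = 0.7

# Bernoulli(p): closed form is 1 - p + p e^t
p = 0.3
print(mgf_discrete({0: 1 - p, 1: p}, t))   # matches 1 - p + p*e^0.7

# Uniform(0, 1): closed form is (e^t - 1) / t
print(mgf_continuous(lambda x: 1.0, t, 0.0, 1.0))  # matches (e^0.7 - 1)/0.7
```

Both estimates agree with the closed forms to many decimal places, which is a useful sanity check whenever you derive a new MGF by hand.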
Moment Generating Function (MGF)
The function $M_X(t) = \mathbb{E}[e^{tX}]$, which encodes all moments of $X$ through its Taylor coefficients: $M_X(t) = \sum_{n=0}^{\infty} \frac{\mathbb{E}[X^n]}{n!}\, t^n$.
Related: Characteristic Function, Probability Generating Function (PGF)
MGF as Bilateral Laplace Transform
For a continuous random variable $X$ with PDF $f_X$, the MGF is the bilateral Laplace transform of the density evaluated at $s = -t$:
$$M_X(t) = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx = \mathcal{B}\{f_X\}(-t).$$
This connection to the Laplace transform is why the MGF inherits all the algebraic machinery of transform calculus, in particular the conversion of convolution to multiplication.
Theorem: Moments from Derivatives of the MGF
If $M_X(t) < \infty$ for $|t| \le h$ with some $h > 0$, then all moments of $X$ exist and
$$\mathbb{E}[X^n] = M_X^{(n)}(0) = \left.\frac{d^n}{dt^n} M_X(t)\right|_{t=0}.$$
Moreover, $M_X$ admits the Taylor expansion
$$M_X(t) = \sum_{n=0}^{\infty} \frac{\mathbb{E}[X^n]}{n!}\, t^n$$
for $|t| < h$.
Differentiate under the integral: $\frac{d^n}{dt^n}\, \mathbb{E}[e^{tX}] = \mathbb{E}[X^n e^{tX}]$, and evaluate at $t = 0$ to extract moments one at a time.
Interchange differentiation and expectation
Since $\mathbb{E}[e^{tX}] < \infty$ in a neighborhood of $0$, the dominated convergence theorem justifies:
$$M_X^{(n)}(t) = \frac{d^n}{dt^n}\, \mathbb{E}[e^{tX}] = \mathbb{E}[X^n e^{tX}].$$
Evaluate at $t = 0$
Setting $t = 0$:
$$M_X^{(n)}(0) = \mathbb{E}[X^n e^{0 \cdot X}] = \mathbb{E}[X^n].$$
Taylor expansion
Since $M_X$ is analytic in $(-h, h)$, it equals its Taylor series there. The $n$-th coefficient is $M_X^{(n)}(0)/n! = \mathbb{E}[X^n]/n!$.
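The moment-extraction recipe can be checked numerically with finite differences. The sketch below (parameter values and step sizes are illustrative choices) differentiates the closed-form Gaussian MGF $e^{\mu t + \sigma^2 t^2/2}$, derived in the Gaussian example in this chapter, and recovers $\mathbb{E}[X] = \mu$ and $\mathbb{E}[X^2] = \mu^2 + \sigma^2$:

```python
import math

MU, SIGMA = 1.5, 0.8

def mgf_gauss(t):
    # Closed-form Gaussian MGF: exp(mu*t + sigma^2 * t^2 / 2)
    return math.exp(MU * t + 0.5 * SIGMA**2 * t**2)

def deriv1(f, t=0.0, h=1e-5):
    # Central-difference approximation to f'(t)
    return (f(t + h) - f(t - h)) / (2 * h)

def deriv2(f, t=0.0, h=1e-4):
    # Central-difference approximation to f''(t)
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

print(deriv1(mgf_gauss))  # close to E[X] = mu = 1.5
print(deriv2(mgf_gauss))  # close to E[X^2] = mu^2 + sigma^2 = 2.89
```

The step sizes trade truncation error against floating-point roundoff; the values above keep both well below the printed precision.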
Theorem: MGF of a Sum of Independent Random Variables
If $X$ and $Y$ are independent random variables whose MGFs exist, then
$$M_{X+Y}(t) = M_X(t)\, M_Y(t).$$
More generally, if $X_1, \dots, X_n$ are independent, then
$$M_{X_1 + \cdots + X_n}(t) = \prod_{i=1}^{n} M_{X_i}(t).$$
The exponential converts a sum into a product: $e^{t(X+Y)} = e^{tX} e^{tY}$. Independence then factors the expectation: $\mathbb{E}[e^{tX} e^{tY}] = \mathbb{E}[e^{tX}]\, \mathbb{E}[e^{tY}]$.
Factor the exponential
$M_{X+Y}(t) = \mathbb{E}[e^{t(X+Y)}] = \mathbb{E}[e^{tX} e^{tY}]$.
Apply independence
Since $X$ and $Y$ are independent, and $e^{tX}$, $e^{tY}$ are measurable functions of independent RVs:
$$\mathbb{E}[e^{tX} e^{tY}] = \mathbb{E}[e^{tX}]\, \mathbb{E}[e^{tY}] = M_X(t)\, M_Y(t).$$
Key Takeaway
The MGF converts the convolution of densities into a product of functions. This is the single most important algebraic property of transforms: it reduces the problem of finding the distribution of a sum to multiplying known functions and inverting.
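The product property is easy to verify by simulation. The sketch below (distribution, sample size, and seed are illustrative choices) estimates the empirical MGF of a sum of two independent Exponential$(\lambda)$ variables and compares it with the product of the individual closed-form MGFs:

```python
import math
import random

random.seed(0)

def emp_mgf(samples, t):
    # Empirical MGF: sample average of e^{t x}
    return sum(math.exp(t * x) for x in samples) / len(samples)

lam, t, n = 2.0, 0.5, 200_000
xs = [random.expovariate(lam) for _ in range(n)]
ys = [random.expovariate(lam) for _ in range(n)]

lhs = emp_mgf([x + y for x, y in zip(xs, ys)], t)  # M_{X+Y}(t), estimated
rhs = (lam / (lam - t)) ** 2                       # M_X(t) * M_Y(t), closed form
print(lhs, rhs)  # both near (2/1.5)^2, about 1.78
```

The Monte Carlo estimate agrees with the product of the two closed-form factors to within sampling error, exactly as the theorem predicts.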
Example: MGF of the Gaussian Distribution
Let $X \sim \mathcal{N}(\mu, \sigma^2)$. Compute $M_X(t)$.
Write the expectation
$$M_X(t) = \int_{-\infty}^{\infty} e^{tx}\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx.$$
Complete the square
The exponent is
$$tx - \frac{(x-\mu)^2}{2\sigma^2} = \mu t + \frac{\sigma^2 t^2}{2} - \frac{\left(x - (\mu + \sigma^2 t)\right)^2}{2\sigma^2}.$$
Evaluate the Gaussian integral
The integral over $\mathbb{R}$ of the completed-square Gaussian density is $1$, leaving:
$$M_X(t) = e^{\mu t + \sigma^2 t^2/2}.$$
Verify moments
$M_X'(t) = (\mu + \sigma^2 t)\, M_X(t)$, so $M_X'(0) = \mu$. Also $M_X''(0) = \mu^2 + \sigma^2$, so $\mathrm{Var}(X) = M_X''(0) - M_X'(0)^2 = \sigma^2$. Correct.
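A quick Monte Carlo check of the result (parameter values, sample size, and seed are illustrative choices): estimate $\mathbb{E}[e^{tX}]$ directly from Gaussian samples and compare with $e^{\mu t + \sigma^2 t^2/2}$.

```python
import math
import random

random.seed(1)
mu, sigma, t, n = 1.0, 0.5, 0.8, 400_000

# Monte Carlo estimate of E[e^{tX}] for X ~ N(mu, sigma^2)
est = sum(math.exp(t * random.gauss(mu, sigma)) for _ in range(n)) / n
exact = math.exp(mu * t + 0.5 * sigma**2 * t**2)
print(est, exact)  # both near e^0.88, about 2.41
```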
Example: MGF of the Exponential Distribution
Let $X \sim \mathrm{Exp}(\lambda)$ with PDF $f_X(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. Find $M_X(t)$ and identify its domain.
Compute the integral
$$M_X(t) = \int_0^{\infty} e^{tx}\, \lambda e^{-\lambda x}\, dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\, dx = \frac{\lambda}{\lambda - t}, \qquad t < \lambda.$$
Extract moments
$M_X'(t) = \frac{\lambda}{(\lambda - t)^2}$, so $\mathbb{E}[X] = M_X'(0) = \frac{1}{\lambda}$. Similarly, $M_X''(0) = \frac{2}{\lambda^2}$, giving $\mathrm{Var}(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.
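Both the MGF and the moments read off from it can be checked by sampling. In this sketch ($\lambda$, $t$, sample size, and seed are illustrative choices), the empirical MGF matches $\lambda/(\lambda - t)$ and the sample mean and variance match $1/\lambda$ and $1/\lambda^2$:

```python
import math
import random

random.seed(3)
lam, t, n = 2.0, 0.5, 300_000
xs = [random.expovariate(lam) for _ in range(n)]

# Empirical MGF vs the closed form lam/(lam - t), valid for t < lam
est = sum(math.exp(t * x) for x in xs) / n
print(est, lam / (lam - t))  # both near 4/3

# Sample mean and variance vs 1/lam and 1/lam^2 read off the MGF derivatives
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(mean, var)  # near 0.5 and 0.25
```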
MGFs of Common Distributions
| Distribution | Parameters | $M_X(t)$ | Domain |
|---|---|---|---|
| Bernoulli | $p$ | $1 - p + p e^t$ | $\mathbb{R}$ |
| Binomial | $n, p$ | $(1 - p + p e^t)^n$ | $\mathbb{R}$ |
| Poisson | $\lambda$ | $e^{\lambda(e^t - 1)}$ | $\mathbb{R}$ |
| Uniform | $a, b$ | $\dfrac{e^{tb} - e^{ta}}{t(b - a)}$ | $\mathbb{R}$ |
| Exponential | $\lambda$ | $\dfrac{\lambda}{\lambda - t}$ | $t < \lambda$ |
| Gamma | $\alpha, \lambda$ | $\left(\dfrac{\lambda}{\lambda - t}\right)^{\alpha}$ | $t < \lambda$ |
| Normal | $\mu, \sigma^2$ | $e^{\mu t + \sigma^2 t^2/2}$ | $\mathbb{R}$ |
Common Mistake: The MGF Does Not Always Exist
Mistake:
Assuming that $M_X(t)$ is finite for all $t$, or even for any $t \neq 0$, without checking. For example, the Cauchy distribution has $M_X(t) = \infty$ for all $t \neq 0$.
Correction:
Always verify the domain of finiteness before using the MGF. Heavy-tailed distributions (Cauchy, Pareto with small shape parameter, log-normal) have MGFs that diverge for every $t > 0$; the Cauchy's diverges for every $t \neq 0$. For such distributions, use the characteristic function instead: it always exists.
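The failure mode is visible in simulation. In the sketch below ($t$, sample sizes, and seed are illustrative choices), the empirical "MGF" of a standard Cauchy variable never settles down as the sample size grows: the estimates jump around, and a single extreme draw can overflow to infinity, because $\mathbb{E}[e^{tX}]$ is genuinely infinite for every $t \neq 0$.

```python
import math
import random

def emp_mgf(xs, t):
    # Empirical MGF estimate: average of e^{t x}; returns inf on overflow
    total = 0.0
    for x in xs:
        try:
            total += math.exp(t * x)
        except OverflowError:  # e^{tx} can exceed the float range for heavy tails
            return math.inf
    return total / len(xs)

def sample_cauchy():
    # Standard Cauchy via inverse-CDF sampling: tan(pi * (U - 1/2))
    return math.tan(math.pi * (random.random() - 0.5))

random.seed(2)
for n in (10**3, 10**4, 10**5):
    xs = [sample_cauchy() for _ in range(n)]
    print(n, emp_mgf(xs, 0.1))  # does not converge as n grows
```

Contrast this with the well-behaved estimates for the Gaussian and exponential examples above, which stabilize quickly.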
MGF Explorer for Common Distributions
Select a distribution and adjust its parameters to see how the MGF changes shape. Notice how the slope at $t = 0$ equals the mean and the curvature at $t = 0$ relates to the variance.
Parameters
Historical Note: Laplace and the Birth of Transform Methods
18th-19th century

The idea of encoding a function through an integral transform dates to Pierre-Simon Laplace (1749-1827), who used what we now call the Laplace transform to solve differential equations. The connection to probability was recognized early: the MGF is precisely the bilateral Laplace transform of the density evaluated on the negative real axis. Laplace himself used generating functions (the discrete analogue) to study random walks and gambler's ruin problems in his *Théorie analytique des probabilités* (1812).
Laplace Transform
The integral transform $\mathcal{L}\{f\}(s) = \int_0^{\infty} e^{-st} f(t)\, dt$. For probability, the bilateral version $\mathcal{B}\{f\}(s) = \int_{-\infty}^{\infty} e^{-st} f(t)\, dt$ connects to the MGF via $M_X(t) = \mathcal{B}\{f_X\}(-t)$.
Related: Moment Generating Function (MGF)
Quick Check
If $X \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $Y \sim \mathcal{N}(\mu_2, \sigma_2^2)$ are independent, what is the MGF of $X + Y$?

By the product property: $M_{X+Y}(t) = e^{\mu_1 t + \sigma_1^2 t^2/2} \cdot e^{\mu_2 t + \sigma_2^2 t^2/2} = e^{(\mu_1 + \mu_2) t + (\sigma_1^2 + \sigma_2^2) t^2/2}$. This is the MGF of $\mathcal{N}(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2)$.
MGF Approach to BER Analysis over Fading Channels
In digital communications over fading channels, the bit error rate (BER) conditioned on the instantaneous SNR $\gamma$ is typically $P_b(\gamma) = Q(\sqrt{2\gamma})$ for BPSK. The average BER requires integrating over the fading distribution: $\bar{P}_b = \int_0^{\infty} P_b(\gamma)\, f_\gamma(\gamma)\, d\gamma$. Using the alternative form $Q(x) = \frac{1}{\pi} \int_0^{\pi/2} \exp\!\left(-\frac{x^2}{2\sin^2\theta}\right) d\theta$ (Craig's representation) and swapping the order of integration, the inner integral over $\gamma$ becomes the MGF of $\gamma$ evaluated at $-1/\sin^2\theta$:
$$\bar{P}_b = \frac{1}{\pi} \int_0^{\pi/2} M_\gamma\!\left(-\frac{1}{\sin^2\theta}\right) d\theta.$$
This converts the BER averaging problem into an MGF evaluation, a technique used extensively in wireless communications.
- Requires that the MGF of the fading distribution exists
- Craig's representation applies to the Q-function specifically
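The MGF approach above can be sketched numerically for BPSK over Rayleigh fading (the average SNR value and grid size are illustrative choices; the comparison uses the classic closed form $\bar{P}_b = \tfrac{1}{2}\bigl(1 - \sqrt{\bar{\gamma}/(1+\bar{\gamma})}\bigr)$ as ground truth):

```python
import math

def avg_ber_mgf(gbar, n=100_000):
    # Average BPSK BER over Rayleigh fading via Craig's representation:
    #   Pb_avg = (1/pi) * integral_0^{pi/2} M_gamma(-1/sin^2(theta)) d(theta)
    # with M_gamma(s) = 1/(1 - gbar*s); midpoint rule for the integral.
    dth = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        s2 = math.sin((i + 0.5) * dth) ** 2
        total += s2 / (s2 + gbar)  # equals M_gamma(-1/sin^2 theta)
    return total * dth / math.pi

def avg_ber_closed(gbar):
    # Classic closed form for BPSK over Rayleigh fading
    return 0.5 * (1.0 - math.sqrt(gbar / (1.0 + gbar)))

gbar = 10.0  # average SNR, linear scale
print(avg_ber_mgf(gbar), avg_ber_closed(gbar))  # both near 0.0233
```

The two values agree to many digits, illustrating why the MGF form is attractive: only a one-dimensional integral of a known function remains, with no Q-function inside.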
Why This Matters: MGF and Fading Channel Analysis
The MGF of the SNR distribution plays a central role in wireless communications. For Rayleigh fading (the SNR $\gamma$ is exponentially distributed with mean $\bar{\gamma}$), the MGF is $M_\gamma(t) = \frac{1}{1 - \bar{\gamma} t}$, which directly yields closed-form BER expressions for most modulation schemes. The MGF approach extends naturally to diversity combining (MRC, EGC) and MIMO systems, where the SNR is a sum or function of channel gains.