Markov Chains and Queuing Basics
Why Markov Chains in Telecommunications?
Consider a wireless channel that alternates between a "good" state (low error rate) and a "bad" state (high error rate). The channel does not flip randomly at every symbol --- it stays in one state for a burst of symbols, then transitions. This bursty behaviour is not captured by a single random variable; it requires a model that tracks the state of the system over time.
Markov chains provide exactly this model. Their defining property --- the memoryless property --- states that the future evolution of the system depends only on the present state, not on the history of how we arrived there. This makes analysis tractable: instead of tracking an ever-growing history, we need only the current state and a matrix of transition probabilities.
Three major application areas in telecommunications are:
- Fading channel models. The Gilbert--Elliott model (two-state Markov chain) captures bursty error patterns on wireless links. Extensions to multi-state Markov models approximate Rayleigh and Rician fading with arbitrary accuracy.
- Protocol state machines. ARQ retransmission protocols, TCP congestion control, and random access protocols (ALOHA, CSMA) are naturally described as Markov chains whose stationary distributions yield throughput and delay metrics.
- Queuing systems. The M/M/1 queue --- Poisson arrivals, exponential service --- is a birth-death Markov chain. Its analysis gives closed-form expressions for delay, buffer occupancy, and packet loss, which are essential for network dimensioning.
Remarkably, computing the long-run behaviour of a Markov chain reduces to finding a left eigenvector of the transition matrix with eigenvalue $1$. This connects directly to the eigenvalue theory of Chapter 1 and demonstrates how linear algebra underpins the analysis of dynamic stochastic systems.
Definition: Discrete-Time Markov Chain (DTMC)
Discrete-Time Markov Chain (DTMC)
A discrete-time Markov chain is a sequence of random variables $\{X_n\}_{n \ge 0}$ taking values in a countable state space $\mathcal{S}$ and satisfying the Markov property: for all $n \ge 0$ and all states $i_0, \dots, i_{n-1}, i, j \in \mathcal{S}$,
$$\Pr(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \dots, X_0 = i_0) = \Pr(X_{n+1} = j \mid X_n = i).$$
In words: given the present state $X_n$, the conditional distribution of the future $(X_{n+1}, X_{n+2}, \dots)$ is independent of the past $(X_0, \dots, X_{n-1})$.
The chain is time-homogeneous if the transition probabilities do not depend on the time index $n$:
$$\Pr(X_{n+1} = j \mid X_n = i) = p_{ij} \quad \text{for all } n \ge 0.$$
Throughout this section, we consider only time-homogeneous chains unless stated otherwise.
Definition: Transition Probability Matrix
Transition Probability Matrix
For a time-homogeneous DTMC on a finite state space $\mathcal{S} = \{1, \dots, N\}$, the transition probability matrix $P$ is the $N \times N$ matrix with entries
$$p_{ij} = \Pr(X_{n+1} = j \mid X_n = i).$$
The matrix has two fundamental properties:
- Non-negativity: $p_{ij} \ge 0$ for all $i, j \in \mathcal{S}$.
- Row stochasticity: each row sums to one, $\sum_{j \in \mathcal{S}} p_{ij} = 1$ for all $i \in \mathcal{S}$.
A matrix satisfying both properties is called a (row) stochastic matrix.
The $n$-step transition probability $p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i)$ is given by the $(i,j)$-entry of $P^n$:
$$p_{ij}^{(n)} = (P^n)_{ij}.$$
Thus, the probability distribution of the chain after $n$ steps, starting from initial distribution $\pi^{(0)}$ (a row vector), is
$$\pi^{(n)} = \pi^{(0)} P^n.$$
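A minimal numerical sketch of this evolution (the $2 \times 2$ matrix here is an arbitrary illustrative choice, not one from the text):

```python
import numpy as np

# Illustrative 2-state row-stochastic matrix (each row sums to 1).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

pi0 = np.array([1.0, 0.0])      # start in state 0 with certainty

# Distribution after n steps: pi_n = pi_0 @ P^n
pi3 = pi0 @ np.linalg.matrix_power(P, 3)

# Sanity check: P^(m+n) = P^m P^n (Chapman--Kolmogorov in matrix form)
assert np.allclose(np.linalg.matrix_power(P, 5),
                   np.linalg.matrix_power(P, 2) @ np.linalg.matrix_power(P, 3))
```

Because each $P^n$ is again row-stochastic, `pi3` remains a valid probability vector at every step.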
Definition: Chapman--Kolmogorov Equations
Chapman--Kolmogorov Equations
For any non-negative integers $m$ and $n$, the $m$-step and $n$-step transition probabilities satisfy the Chapman--Kolmogorov equations:
$$p_{ij}^{(m+n)} = \sum_{k \in \mathcal{S}} p_{ik}^{(m)} \, p_{kj}^{(n)}.$$
In matrix form, this becomes
$$P^{m+n} = P^m P^n.$$
The interpretation is intuitive: to go from state $i$ to state $j$ in $m+n$ steps, the chain must pass through some intermediate state $k$ after $m$ steps. Summing over all possible intermediate states gives the total probability.
This equation is the discrete-time analogue of the semigroup property of matrix exponentials in continuous-time systems.
Definition: Irreducibility and Aperiodicity
Irreducibility and Aperiodicity
Communication. State $j$ is accessible from state $i$ (written $i \to j$) if $p_{ij}^{(n)} > 0$ for some $n \ge 0$. States $i$ and $j$ communicate (written $i \leftrightarrow j$) if $i \to j$ and $j \to i$. Communication is an equivalence relation and partitions $\mathcal{S}$ into communicating classes.
Irreducibility. A Markov chain is irreducible if $\mathcal{S}$ consists of a single communicating class, i.e., every state is accessible from every other state. Equivalently, for every pair $(i, j)$ there exists $n$ such that $p_{ij}^{(n)} > 0$.
Period. The period of state $i$ is
$$d(i) = \gcd\{n \ge 1 : p_{ii}^{(n)} > 0\}.$$
If $d(i) = 1$, the state is aperiodic; if $d(i) > 1$, the state is periodic with period $d(i)$.
Aperiodic chain. An irreducible chain is aperiodic if every state is aperiodic. Since all states in an irreducible chain share the same period, it suffices to check a single state.
A Markov chain that is both irreducible and aperiodic is sometimes called ergodic (in the Markov chain sense). For such chains, the long-run fraction of time spent in each state converges to a unique stationary distribution.
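The period of a state can be checked numerically from the definition. The sketch below (with a hypothetical helper `period` and an arbitrary iteration cap) takes the gcd of all step counts $n$ for which $(P^n)_{ii} > 0$:

```python
from math import gcd
import numpy as np

def period(P, i, n_max=50):
    """Period of state i: gcd of all n <= n_max with (P^n)[i, i] > 0.
    For small finite chains a modest n_max is enough in practice."""
    d = 0
    Pn = np.eye(len(P))
    for n in range(1, n_max + 1):
        Pn = Pn @ P
        if Pn[i, i] > 1e-12:
            d = gcd(d, n)
    return d

# Deterministic 2-cycle: the chain returns to state 0 only at even times.
P_cycle = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
# A self-loop (p_00 > 0) immediately makes the state aperiodic.
P_lazy = np.array([[0.5, 0.5],
                   [1.0, 0.0]])
```

This illustrates the remark above: adding any self-loop to an irreducible chain forces period $1$.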
Definition: Stationary Distribution
Stationary Distribution
A probability vector $\pi = (\pi_1, \dots, \pi_N)$ is a stationary distribution (or steady-state distribution) of the Markov chain with transition matrix $P$ if
$$\pi P = \pi, \qquad \pi_i \ge 0, \qquad \sum_{i} \pi_i = 1.$$
Component-wise, this reads
$$\pi_j = \sum_{i \in \mathcal{S}} \pi_i \, p_{ij} \quad \text{for all } j \in \mathcal{S}.$$
Eigenvector interpretation. The stationarity condition says that $\pi$ is a left eigenvector of $P$ corresponding to eigenvalue $1$. Equivalently, $P^\top \pi^\top = \pi^\top$, so $\pi^\top$ is a right eigenvector of $P^\top$ with eigenvalue $1$.
By the Perron--Frobenius theorem, every row-stochastic matrix has $\lambda_1 = 1$ as its largest eigenvalue (in magnitude), so the stationary distribution is associated with the dominant eigenvalue --- the very eigenvector that the power method of Chapter 1 converges to.
Theorem: Existence and Uniqueness of the Stationary Distribution
Let $\{X_n\}$ be an irreducible, aperiodic, positive-recurrent Markov chain on a finite state space with transition matrix $P$. Then:
- There exists a unique stationary distribution $\pi$ satisfying $\pi P = \pi$ with $\pi_i > 0$ for all $i$.
- For every initial distribution, the $n$-step transition matrix converges:
$$\lim_{n \to \infty} P^n = \mathbf{1} \pi,$$
where $\mathbf{1}$ is the column vector of all ones. Equivalently, $p_{ij}^{(n)} \to \pi_j$ for all $i, j$.
- The stationary probability $\pi_i$ equals the long-run fraction of time the chain spends in state $i$:
$$\pi_i = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}\{X_k = i\} \quad \text{almost surely}.$$
Every finite irreducible chain is positive-recurrent (the chain returns to every state in finite expected time). Aperiodicity ensures that $P^n$ converges to a rank-one matrix rather than cycling. The Perron--Frobenius theorem guarantees that eigenvalue $1$ is simple and all other eigenvalues satisfy $|\lambda| < 1$, so $P^n \to \mathbf{1}\pi$ geometrically fast.
Note: For countable (infinite) state spaces, irreducibility alone does not guarantee positive recurrence --- the chain may be null recurrent or transient. The finite-state-space assumption eliminates these subtleties.
Example: Gilbert--Elliott Channel Model
[Interactive figure: Markov chain state distribution evolution for the Gilbert--Elliott model under adjustable parameters.]
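As a concrete sketch of the Gilbert--Elliott chain, the snippet below uses illustrative transition parameters (the values of `p` and `q` are assumptions, not taken from the example) and the standard closed form for a two-state stationary distribution:

```python
import numpy as np

# Assumed illustrative parameters:
# p = P(good -> bad), q = P(bad -> good).
p, q = 0.05, 0.4

P = np.array([[1 - p, p],
              [q, 1 - q]])

# Closed form for a two-state chain: pi = (q, p) / (p + q)
pi = np.array([q, p]) / (p + q)

assert np.allclose(pi @ P, pi)   # stationarity: pi P = pi
```

Here $\pi_{\text{good}} = q/(p+q)$ is the long-run fraction of symbols sent over the good channel, which is exactly the quantity needed to compute the average burst-error statistics.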
Definition: Birth--Death Process
Birth--Death Process
A birth--death process is a Markov chain on the non-negative integers $\{0, 1, 2, \dots\}$ in which transitions occur only to adjacent states:
$$p_{i,i+1} = b_i, \qquad p_{i,i-1} = d_i, \qquad p_{i,i} = 1 - b_i - d_i,$$
with $d_0 = 0$ (no deaths in state $0$) and all other transition probabilities equal to zero.
Balance equations. For a stationary distribution $\pi$, the detailed balance equations (also called local balance) equate the probability flow across each boundary:
$$\pi_i b_i = \pi_{i+1} d_{i+1}, \qquad i = 0, 1, 2, \dots$$
These are stronger than the global balance equations ($\pi P = \pi$) but are automatically satisfied for birth--death chains because transitions skip no states.
Solving recursively:
$$\pi_i = \pi_0 \prod_{k=0}^{i-1} \frac{b_k}{d_{k+1}},$$
where $\pi_0$ is determined by the normalisation $\sum_{i} \pi_i = 1$.
Birth--death processes are the backbone of queuing theory: the "births" are customer arrivals and the "deaths" are service completions.
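The recursion translates directly into code. This sketch (the function name and truncation to a finite number of states are my assumptions, needed to make the infinite chain computable) builds the unnormalised products and then normalises:

```python
import numpy as np

def birth_death_stationary(b, d, n_states):
    """Stationary distribution of a birth--death chain truncated to
    n_states states, via the recursion pi_i = pi_{i-1} * b_{i-1} / d_i."""
    pi = np.ones(n_states)
    for i in range(1, n_states):
        pi[i] = pi[i - 1] * b(i - 1) / d(i)
    return pi / pi.sum()           # normalisation sum_i pi_i = 1

# Constant rates b_i = 0.4, d_i = 0.8 give ratio 0.5 per step, so the
# truncated result is (very nearly) the geometric distribution (1-rho) rho^n.
pi = birth_death_stationary(lambda i: 0.4, lambda i: 0.8, n_states=50)
```

With constant rates this reproduces the M/M/1 geometric distribution derived below; state-dependent `b` and `d` handle finite buffers or multiple servers with the same three lines.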
Definition: The M/M/1 Queue
The M/M/1 Queue
The M/M/1 queue is the simplest and most important single-server queuing model:
- M (Markovian arrivals): packets arrive according to a Poisson process with rate $\lambda$ (hence exponential inter-arrival times with mean $1/\lambda$).
- M (Markovian service): service times are i.i.d. exponential with rate $\mu$ (mean service time $1/\mu$).
- 1: a single server.
The queue length $N(t)$ forms a continuous-time birth--death process with constant birth rate $\lambda$ and death rate $\mu$ for all states. The discrete-time embedded chain at transition epochs is also a birth--death Markov chain.
Traffic intensity:
$$\rho = \frac{\lambda}{\mu}.$$
Stability condition: the queue has a stationary distribution if and only if $\rho < 1$ (the server must be faster than the arrival rate, on average).
Stationary distribution. From the birth--death recursion with $b_i = \lambda$, $d_i = \mu$:
$$\pi_n = (1 - \rho)\rho^n, \qquad n = 0, 1, 2, \dots$$
This is a geometric distribution with parameter $\rho$. The probability that the queue is empty is $\pi_0 = 1 - \rho$, and the probability that the queue has $n$ or more customers is $\rho^n$.
Example: M/M/1 Queue Performance Metrics
[Interactive figure: Markov chain state evolution and convergence to stationarity.]
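The standard M/M/1 metrics can be packaged in a few lines. This is a sketch (the function name and the example rates are assumptions for illustration); note that it refuses to compute metrics for an unstable queue:

```python
def mm1_metrics(lam, mu):
    """Steady-state M/M/1 metrics; raises if the queue is unstable.
    lam: arrival rate (packets/s), mu: service rate (packets/s)."""
    rho = lam / mu
    if rho >= 1:
        raise ValueError("unstable: require lambda < mu (rho < 1)")
    L = rho / (1 - rho)            # mean number in system
    T = 1 / (mu - lam)             # mean sojourn time; Little's law: L = lam*T
    p_empty = 1 - rho              # P(system empty) = pi_0
    return rho, L, T, p_empty

# Illustrative rates (assumed): lambda = 800 pkt/s, mu = 1000 pkt/s
rho, L, T, p0 = mm1_metrics(800, 1000)
# -> rho = 0.8, L = 4 packets, T = 0.005 s, p0 = 0.2
```

The guard clause implements the stability check discussed in the Common Mistake box below: without it, $\rho \ge 1$ would silently yield negative "metrics".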
Why This Matters: Markov Models for Fading Channels and Protocols
Fading channel models. The Gilbert--Elliott model (Example above) is the simplest finite-state Markov channel (FSMC). More accurate models partition the received SNR range into $K$ regions and define transition probabilities from the level-crossing rates of the fading process:
$$p_{k,k+1} \approx \frac{N_{k+1}}{R_s \, \pi_k}, \qquad p_{k,k-1} \approx \frac{N_k}{R_s \, \pi_k},$$
where $N_k$ is the level-crossing rate at threshold $A_k$, $\pi_k$ is the steady-state probability of state $k$, and $R_s$ is the symbol rate. The stationary distribution of the FSMC gives the fraction of time the channel SNR falls in each region, enabling computation of average throughput under adaptive modulation and coding.
Random access protocols. Slotted ALOHA can be modeled as a Markov chain where the state is the number of backlogged users. The analysis reveals a bistable system: one stable point near zero backlog (high throughput) and another at high backlog (near zero throughput). This insight motivated the development of exponential backoff algorithms and CSMA/CA.
ARQ protocols. Stop-and-wait, Go-Back-$N$, and selective repeat ARQ protocols are naturally modeled as Markov chains. The state tracks the sender's window position, outstanding acknowledgements, and retransmission status. The stationary distribution yields the throughput efficiency $\eta$, i.e., the fraction of transmitted frames that carry new data.
TCP congestion control. The AIMD (additive increase, multiplicative decrease) mechanism of TCP can be modeled as a Markov chain, leading to the well-known TCP throughput formula:
$$\text{throughput} \approx \frac{C}{\text{RTT}\sqrt{p}},$$
where $p$ is the packet loss probability, RTT is the round-trip time, and $C$ is a constant.
Power Method for Computing the Stationary Distribution
Complexity: $O(N^2)$ per iteration (vector--matrix product); typically converges in $O\!\left(\log(1/\epsilon) / \log(1/|\lambda_2|)\right)$ iterations.
This is precisely the power method from Section 1.7, applied to $P^\top$. Recall that the power method computes the dominant eigenvector of a matrix by repeated multiplication. Since $P$ is row-stochastic, its largest eigenvalue is $\lambda_1 = 1$ (by Perron--Frobenius), and the corresponding left eigenvector is the stationary distribution $\pi$.
Convergence rate. The convergence is geometric with ratio $|\lambda_2|$, the second-largest eigenvalue magnitude. The quantity $1 - |\lambda_2|$ is called the spectral gap and measures how fast the chain mixes.
Connection to Chapter 1. In Chapter 1, we used the power method to find the dominant eigenvector of symmetric matrices (e.g., for PCA or dominant mode analysis). Here the matrix is generally not symmetric --- it is stochastic --- but the same iterative principle applies. The key insight is that the stationary distribution of a Markov chain is fundamentally an eigenvalue problem.
Alternatives. For very large sparse chains, direct methods (Gaussian elimination on $I - P^\top$ plus the normalisation constraint) or iterative solvers (GMRES, Gauss--Seidel) may be more efficient.
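A minimal sketch of the power iteration described above (the function name, tolerance, and example matrix are illustrative assumptions). Iterating $\pi \leftarrow \pi P$ keeps $\pi$ a probability vector at every step, so no renormalisation is needed:

```python
import numpy as np

def stationary_power_method(P, tol=1e-12, max_iter=10_000):
    """Left-eigenvector power iteration: pi <- pi @ P until convergence.
    Assumes an irreducible, aperiodic chain; error shrinks like |lambda_2|^n."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)        # uniform starting distribution
    for _ in range(max_iter):
        pi_next = pi @ P
        if np.abs(pi_next - pi).sum() < tol:
            return pi_next
        pi = pi_next
    return pi

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
pi = stationary_power_method(P)     # exact answer: (0.8, 0.2)
```

For this matrix $\lambda_2 = 0.5$, so the error halves at each iteration and convergence to machine precision takes only a few dozen steps.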
Quick Check
Consider the two-state Markov chain on states $\{1, 2\}$ with transition matrix
$$P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}, \qquad 0 < a, b < 1.$$
What is the stationary probability $\pi_1$ (the long-run fraction of time spent in state 1)?
Correct. The balance equation $\pi_1 a = \pi_2 b$ together with the normalisation $\pi_1 + \pi_2 = 1$ gives
$$\pi_1 = \frac{b}{a+b}, \qquad \pi_2 = \frac{a}{a+b}.$$
Quick Check
An M/M/1 queue has arrival rate $\lambda$ packets/s and service rate $\mu$ packets/s, with $\lambda < \mu$. What is the mean number of packets in the system, $L$?
Correct. The traffic intensity is $\rho = \lambda/\mu$, so $L = \dfrac{\rho}{1-\rho} = \dfrac{\lambda}{\mu - \lambda}$ packets.
Quick Check
Which of the following is required for a finite Markov chain to have a unique stationary distribution?
The chain must be irreducible
The chain must be aperiodic
The transition matrix must be symmetric
All entries of $P$ must be strictly positive
Correct. Irreducibility (all states communicate) guarantees uniqueness of the stationary distribution for finite chains. Aperiodicity is additionally needed for convergence of $P^n$ to the stationary distribution, but uniqueness holds even for periodic irreducible chains.
Markov Chain
A stochastic process $\{X_n\}_{n \ge 0}$ (discrete-time) or $\{X(t)\}_{t \ge 0}$ (continuous-time) satisfying the Markov (memoryless) property: conditioned on the present state, the future is independent of the past. Markov chains are the fundamental building block of queuing theory, protocol analysis, and finite-state channel models.
Transition Matrix
A row-stochastic matrix $P$ whose $(i,j)$-entry $p_{ij}$ gives the one-step transition probability from state $i$ to state $j$. The $n$-step transition probabilities are given by $P^n$.
Stationary Distribution
A probability vector $\pi$ satisfying $\pi P = \pi$. It is the left eigenvector of $P$ associated with eigenvalue $1$. For an irreducible chain, the stationary distribution is unique and represents the long-run fraction of time spent in each state.
Irreducible
A Markov chain is irreducible if every state can be reached from every other state in a finite number of steps. Equivalently, the state space consists of a single communicating class. Irreducibility guarantees uniqueness of the stationary distribution (for finite chains).
M/M/1 Queue
A single-server queuing model with Poisson arrivals (rate $\lambda$) and exponential service times (rate $\mu$). The stationary queue length distribution is geometric: $\pi_n = (1-\rho)\rho^n$ where $\rho = \lambda/\mu < 1$. Key metrics: mean queue length $L = \rho/(1-\rho)$, mean delay $T = 1/(\mu - \lambda)$.
Common Mistake: Forgetting to Check for M/M/1 Stability
Mistake:
Applying the M/M/1 formulas $L = \rho/(1-\rho)$ and $T = 1/(\mu - \lambda)$ without first verifying that $\rho < 1$.
When $\rho \ge 1$ (i.e., $\lambda \ge \mu$), the queue has no stationary distribution: the queue length grows without bound, the mean delay is infinite, and the formulas produce negative or nonsensical values.
This error is common in homework and can also arise in system design when traffic spikes push the offered load above the service capacity.
Correction:
Always check stability first. Before computing any M/M/1 metric, verify:
$$\rho = \frac{\lambda}{\mu} < 1.$$
If $\rho \ge 1$, the system is unstable and no steady-state analysis applies. In practice, if $\rho$ is close to $1$, the delay and queue length are extremely sensitive to small changes in $\lambda$ or $\mu$ --- the system operates near the "knee" of the delay curve.
For $\rho \ge 1$, one must either increase the service rate (faster link, more efficient coding), decrease the arrival rate (admission control, traffic shaping), or add more servers (M/M/$c$ queue).
Key Takeaway
The central message of this section in three points:
- Markov chains reduce dynamics to matrices. A complex system with many interacting states --- fading channels, protocol machines, queuing networks --- is fully characterised by its transition matrix $P$. All temporal behaviour is encoded in the powers $P^n$.
- The stationary distribution is an eigenvector. The long-run behaviour $\pi P = \pi$ is a left eigenvector equation with eigenvalue $1$. Computing $\pi$ is exactly the eigenvalue problem from Chapter 1 applied to a stochastic matrix. The power method, QR algorithm, and spectral decomposition all carry over directly.
- Queuing theory = birth--death Markov chains. The M/M/1 queue produces closed-form delay and throughput formulas that every network engineer uses daily. The key parameter is the traffic intensity $\rho = \lambda/\mu$: as $\rho \to 1$, delay blows up as $1/(1-\rho)$ --- a fundamental limit that no clever protocol can circumvent.
Markov chains are thus a bridge between the linear algebra of Chapter 1 and the probability theory of Chapter 2, unified by the observation that steady-state analysis of dynamic systems is an eigenvalue problem.