System Reliability
Why Reliability Theory?
Every physical system eventually fails. The engineering question is not whether failure will occur, but when and how likely. Reliability theory gives us a probability model for this question: each component operates correctly with probability $p_i$ and fails with probability $1 - p_i$, independently of all other components. The system is reliable if and only if a sufficient subset of components is operational.
In wireless communications, this model applies to redundant base station deployments, multi-path routing in mesh networks, and redundant power amplifiers in phased-array transmitters. Understanding how component-level reliabilities combine into system-level reliability is an indispensable engineering skill.
Definition: Component Reliability
Component Reliability
A component $i$ is a binary device: it is either working (event $W_i$) or failed (event $W_i^c$). The reliability of component $i$ is $p_i = P(W_i)$. Components are assumed independent: knowledge that component $i$ has failed gives no information about whether component $j$ has failed.
Independence is a modeling assumption, not a physical law. Components powered by the same power bus are not independent: their failures are positively correlated. In that case the formulas below give overly optimistic reliability estimates.
Reliability
The probability that a component or system performs its intended function over a specified period under specified operating conditions. Written $p_i$ for component $i$ and $R_s$ for a system.
Related: Series System, Parallel System, General Inclusion-Exclusion Formula
Definition: Series System
Series System
A series system of $n$ components works if and only if all components work: $R_s = P(W_1 \cap W_2 \cap \cdots \cap W_n) = \prod_{i=1}^{n} p_i.$ Since each $p_i \le 1$, the product is at most $\min_i p_i$, and adding components to a series system can only decrease reliability.
A series system models scenarios where every single component is critical: a single break in a power line, a failed decoder stage, or a missing link in a store-and-forward relay chain.
Series System
A system that works if and only if all components work. Reliability: $R_s = \prod_{i=1}^{n} p_i$.
Related: Reliability, Parallel System
Definition: Parallel System
Parallel System
A parallel system of $n$ components works if and only if at least one component works: $R_s = P(W_1 \cup W_2 \cup \cdots \cup W_n) = 1 - \prod_{i=1}^{n} (1 - p_i).$ Adding components to a parallel system can only increase reliability.
The derivation uses independence and De Morgan's law: $P\left(\bigcup_{i=1}^{n} W_i\right) = 1 - P\left(\bigcap_{i=1}^{n} W_i^c\right) = 1 - \prod_{i=1}^{n} (1 - p_i)$.
Parallel System
A system that works if and only if at least one of its components works. Reliability: $R_s = 1 - \prod_{i=1}^{n} (1 - p_i)$.
Related: Reliability, Series System
Example: Series vs. Parallel: A Numerical Comparison
You have $n = 5$ components each with reliability $p = 0.9$. Compute the system reliability when the components are connected (a) in series and (b) in parallel.
Series system
$R_{\text{series}} = p^n = 0.9^5 \approx 0.590.$ Even with 90%-reliable components, a 5-stage series system is barely 59% reliable: every additional stage degrades the system.
Parallel system
$R_{\text{parallel}} = 1 - (1 - p)^n = 1 - 0.1^5 = 1 - 10^{-5} = 0.99999.$ Redundancy drives the failure probability down to $10^{-5}$ at the system level.
Takeaway
Series connections multiply success probabilities: the system is less reliable than any single component. Parallel connections multiply failure probabilities in the failure domain: the system failure probability shrinks exponentially with redundancy.
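The two formulas are a few lines of code. This sketch (the function names are illustrative, not from the text) reproduces the numbers in the worked example above:

```python
def series_reliability(p: float, n: int) -> float:
    """All n components must work: R = p**n."""
    return p ** n

def parallel_reliability(p: float, n: int) -> float:
    """At least one of n components must work: R = 1 - (1 - p)**n."""
    return 1 - (1 - p) ** n

p, n = 0.9, 5
print(round(series_reliability(p, n), 5))    # 0.59049
print(round(parallel_reliability(p, n), 5))  # 0.99999
```

The symmetry is visible in the code: series multiplies success probabilities, parallel multiplies failure probabilities and complements.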
Series vs. Parallel System Reliability
Explore how system reliability depends on component count and individual reliability for both series and parallel configurations. Observe that parallel systems improve dramatically with redundancy while series systems deteriorate.
Theorem: Inclusion-Exclusion for System Reliability
Let $W_1, \dots, W_n$ be the events that components $1, \dots, n$ are working. For any system whose working condition is captured by the union $W_1 \cup \cdots \cup W_n$, the probability satisfies $P\left(\bigcup_{i=1}^{n} W_i\right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{i_1 < \cdots < i_k} P(W_{i_1} \cap \cdots \cap W_{i_k}).$ Under independence, $P(W_{i_1} \cap \cdots \cap W_{i_k}) = p_{i_1} p_{i_2} \cdots p_{i_k}$, and the formula simplifies to a signed sum of products of component reliabilities.
Direct counting over-counts outcomes where multiple components work. Inclusion-exclusion corrects by alternately adding and subtracting the higher-order intersection terms. Each inner sum accounts for all $\binom{n}{k}$ subsets of size $k$.
For two components, verify directly: $P(W_1 \cup W_2) = P(W_1) + P(W_2) - P(W_1 \cap W_2) = p_1 + p_2 - p_1 p_2$.
For three components, identify all terms $p_i$, $p_i p_j$, and $p_1 p_2 p_3$, and which get $+$ vs. $-$ signs.
The general formula follows from the indicator identity $\mathbf{1}_{W_1 \cup \cdots \cup W_n} = 1 - \prod_{i=1}^{n} (1 - \mathbf{1}_{W_i})$, then taking expectations.
Indicator identity
For any events $W_1, \dots, W_n$, the indicator of their union satisfies $\mathbf{1}_{\bigcup_i W_i} = 1 - \prod_{i=1}^{n} (1 - \mathbf{1}_{W_i}).$ This is a deterministic identity: both sides equal 1 if at least one $W_i$ occurs, and 0 if none occur.
Expand the product
Distributing the product over all subsets $S \subseteq \{1, \dots, n\}$: $\prod_{i=1}^{n} (1 - \mathbf{1}_{W_i}) = \sum_{S} (-1)^{|S|} \prod_{i \in S} \mathbf{1}_{W_i},$ so $\mathbf{1}_{\bigcup_i W_i} = \sum_{S \neq \emptyset} (-1)^{|S|+1} \prod_{i \in S} \mathbf{1}_{W_i}$.
Take expectations
Applying $E[\mathbf{1}_A] = P(A)$ and using linearity of expectation: $P\left(\bigcup_i W_i\right) = \sum_{S \neq \emptyset} (-1)^{|S|+1} P\left(\bigcap_{i \in S} W_i\right)$.
Key Takeaway
Inclusion-exclusion converts a union-of-events probability into a signed sum of intersection probabilities, which under independence factor into products of component reliabilities. This is the workhorse for analyzing complex networks that are neither purely series nor purely parallel.
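This recipe mechanizes directly. As an illustration (the function name and interface are my own, not from the text), the following sketch runs inclusion-exclusion over a list of minimal path sets, using the fact that an intersection of path events requires every component in the union of those paths to work:

```python
from itertools import combinations

def union_reliability(path_sets, p):
    """Inclusion-exclusion over minimal path sets.

    path_sets : list of sets of component indices (0-based)
    p         : list of component reliabilities, p[i] for component i

    Each nonempty subset S of the path sets contributes
    (-1)^(|S|+1) * prod(p[i] for i in the union of paths in S),
    by independence of the components.
    """
    total = 0.0
    for r in range(1, len(path_sets) + 1):
        for subset in combinations(path_sets, r):
            comps = set().union(*subset)
            term = 1.0
            for i in comps:
                term *= p[i]
            total += (-1) ** (r + 1) * term
    return total

# Two components in parallel: path sets {0} and {1};
# P(W1 ∪ W2) = p1 + p2 - p1*p2 = 0.9 + 0.8 - 0.72 = 0.98
print(union_reliability([{0}, {1}], [0.9, 0.8]))
```

The runtime is exponential in the number of path sets, which is fine for small networks like the bridge below but motivates the Bonferroni bounds later in this section.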
Definition: Bridge Network
Bridge Network
A bridge network (also called a Wheatstone bridge topology) has five components arranged so that no simple series-parallel reduction applies. Labeling the components 1–5, where component 5 is the "bridge" link, the network works if and only if at least one of the following minimal path sets is operational: $\{1,2\}$, $\{3,4\}$, $\{1,5,4\}$, $\{3,5,2\}$. The system reliability requires inclusion-exclusion over these four paths.
The bridge network is the canonical example demonstrating that inclusion-exclusion is necessary: no series-parallel simplification can reduce it. It appears in relay networks and multi-hop wireless routing problems.
Example: Bridge Network Reliability via Inclusion-Exclusion
Compute the reliability of the bridge network with five independent components each having reliability $p$. Apply inclusion-exclusion over the four minimal path sets $P_1 = \{1,2\}$, $P_2 = \{3,4\}$, $P_3 = \{1,5,4\}$, $P_4 = \{3,5,2\}$.
Probabilities of individual path sets working
Let $A_k$ be the event that all components in path $P_k$ work. Then $P(A_1) = P(A_2) = p^2$ and $P(A_3) = P(A_4) = p^3$.
Pairwise intersections
The system works iff $A_1 \cup A_2 \cup A_3 \cup A_4$ occurs. Pairwise intersections (all components in both paths must work): five of the six pairs cover exactly four distinct components, contributing $p^4$ each, while $A_3 \cap A_4$ covers all five components, contributing $p^5$. The pairwise sum is $5p^4 + p^5$.
Triple and quadruple intersections
Every triple intersection and the quadruple intersection cover all five components: $P(A_i \cap A_j \cap A_k) = p^5$ for each of the four triples, and $P(A_1 \cap A_2 \cap A_3 \cap A_4) = p^5$.
Apply inclusion-exclusion
$R_s = (2p^2 + 2p^3) - (5p^4 + p^5) + 4p^5 - p^5 = 2p^2 + 2p^3 - 5p^4 + 2p^5.$ Sanity checks: $R_s(1) = 2 + 2 - 5 + 2 = 1$ and $R_s(0) = 0$. $\blacksquare$
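As an independent sanity check on the algebra, one can enumerate all $2^5$ component states directly and compare against the polynomial (the path labeling follows the definition above; the helper names are my own):

```python
from itertools import product

# Minimal path sets of the bridge, with component 5 as the cross-link
PATHS = [{1, 2}, {3, 4}, {1, 5, 4}, {3, 5, 2}]

def bridge_reliability(p: float) -> float:
    """Exact reliability by summing over all 2^5 component states."""
    total = 0.0
    for bits in product((True, False), repeat=5):
        state = dict(zip(range(1, 6), bits))
        # System works iff some path has all of its components up
        if any(all(state[i] for i in path) for path in PATHS):
            prob = 1.0
            for i in range(1, 6):
                prob *= p if state[i] else 1 - p
            total += prob
    return total

p = 0.9
poly = 2*p**2 + 2*p**3 - 5*p**4 + 2*p**5
print(round(bridge_reliability(p), 5), round(poly, 5))  # 0.97848 0.97848
```

Brute-force enumeration scales as $2^n$, so it is only a verification tool, but for five components it confirms the inclusion-exclusion result at every $p$.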
Bridge Network Reliability
The bridge network computed via inclusion-exclusion. Compare with the lower bounds from individual path sets and upper bounds from the union bound. The bridge link (component 5) can be toggled off (setting $p_5 = 0$) to see what happens when the cross-link fails.
Historical Note: Origins of Reliability Theory
1950s–1965: Modern reliability theory emerged from the post-World War II U.S. military effort to improve the dependability of complex electronics. The 1950s saw the failure rate of airborne electronics during missions rise alarmingly; the U.S. Department of Defense commissioned a study (1957) that established the field. Richard Barlow and Frank Proschan's 1965 textbook Mathematical Theory of Reliability provided the rigorous probabilistic foundations, including the concept of coherent systems and the role of inclusion-exclusion in structure function analysis.
Definition: Coherent System
Coherent System
A system is coherent if:
- Replacing a failed component by a working one can never cause a working system to fail (monotone structure function).
- Every component is relevant: there exist states of the other components such that changing component $i$ from failed to working changes the system state.
Formally, the structure function $\phi: \{0,1\}^n \to \{0,1\}$ (where $\phi(\mathbf{x}) = 1$ means the system works given component state vector $\mathbf{x}$) must be non-decreasing in each argument. Series and parallel systems are both coherent; the bridge network is coherent.
Coherence rules out pathological systems where adding a redundant component can somehow cause failure. Every physically meaningful system design should be coherent.
Theorem: Bonferroni Bounds for System Reliability
Let $W_1, \dots, W_n$ be the events that $n$ independent components with reliabilities $p_1, \dots, p_n$ are working, and let $S_k = \sum_{i_1 < \cdots < i_k} p_{i_1} \cdots p_{i_k} = e_k(p_1, \dots, p_n)$ be the $k$-th elementary symmetric polynomial. The inclusion-exclusion partial sums $\sum_{k=1}^{m} (-1)^{k+1} S_k$ alternate around the true probability $P\left(\bigcup_i W_i\right)$: truncating at odd $m$ gives an upper bound, at even $m$ a lower bound. In particular, the union bound (first-order approximation) gives $P\left(\bigcup_i W_i\right) \le S_1 = \sum_i p_i$, and the second-order lower bound is $P\left(\bigcup_i W_i\right) \ge S_1 - S_2$.
The Bonferroni inequalities are the truncations of inclusion-exclusion. Stopping at an odd term gives an upper bound; stopping at an even term gives a lower bound. This is useful when exact computation is expensive but two-sided bounds suffice.
Consider the indicator identity for $\mathbf{1}_{\bigcup_i W_i}$ and truncate the polynomial expansion.
The truncation error at order $m$ has the same sign as the first omitted term, whose sign is $(-1)^{m+2}$.
Indicator expansion
From the proof of the inclusion-exclusion theorem, $\mathbf{1}_{\bigcup_i W_i} = \sum_{k=1}^{n} (-1)^{k+1} \sum_{i_1 < \cdots < i_k} \mathbf{1}_{W_{i_1}} \cdots \mathbf{1}_{W_{i_k}}.$ This is an exact equality of random variables.
Truncation and sign of error
Truncating at order $m$, the remainder has the same sign as the next, $(m+1)$-th, term, whose sign is $(-1)^{m+2}$. For $m$ odd this is negative, so the truncation overestimates: an upper bound. For $m$ even it is positive, so the truncation underestimates: a lower bound.
Take expectations
Taking $E[\cdot]$ preserves the inequalities and converts indicators into probabilities: $\sum_{k=1}^{m} (-1)^{k+1} S_k$ bounds $P\left(\bigcup_i W_i\right)$ from above for odd $m$ and from below for even $m$.
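For $n$ identical components, $S_k = \binom{n}{k} p^k$, and the alternating bounds are easy to tabulate numerically. This sketch (the function name and the values $p = 0.3$, $n = 4$ are my own illustrative choices) compares each truncation against the exact parallel-system probability:

```python
from math import comb

def bonferroni_partial_sums(p: float, n: int) -> list:
    """Partial sums sum_{k<=m} (-1)^(k+1) C(n,k) p^k for m = 1..n.

    With n identical independent components, S_k = C(n,k) p^k; odd-m
    truncations over-shoot P(at least one works), even-m truncations
    under-shoot, and m = n recovers it exactly.
    """
    sums, partial = [], 0.0
    for k in range(1, n + 1):
        partial += (-1) ** (k + 1) * comb(n, k) * p ** k
        sums.append(partial)
    return sums

p, n = 0.3, 4
exact = 1 - (1 - p) ** n  # parallel-system reliability, = 0.7599
for m, b in enumerate(bonferroni_partial_sums(p, n), start=1):
    kind = "upper" if m % 2 == 1 else "lower"
    print(f"m={m}: {b:.4f} ({kind} bound vs exact {exact:.4f})")
```

Note how the bounds tighten as $m$ grows and coincide with the exact value at $m = n$.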
Why This Matters: Wireless Network Availability and Diversity
In a multi-hop wireless relay network with $n$ hops on the path from source to destination, successful delivery requires all hops to succeed: a series system. If each hop succeeds with probability $p$ (determined by the fading margin and coding scheme), the end-to-end success probability is $p^n$, degrading rapidly with path length.
Spatial diversity combats this: a network with $L$ independent paths (frequency diversity, spatial diversity from multiple antennas, or route diversity) acts as a parallel system. End-to-end failure probability drops to $(1 - p)^L$. At high SNR, where $1 - p \ll 1$, diversity order $L$ suppresses the outage probability from order $1 - p$ to order $(1 - p)^L$.
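A quick numerical sketch of this trade-off (the hop count, path count, and per-hop success probability below are illustrative values, not from the text):

```python
def chain_success(p: float, n: int) -> float:
    """n-hop series relay chain: every hop must succeed."""
    return p ** n

def diversity_outage(p: float, L: int) -> float:
    """L independent end-to-end paths: outage only if all L fail."""
    return (1 - p) ** L

p = 0.9
print(chain_success(p, 10))           # ~0.349: long chains decay fast
for L in (1, 2, 3):
    print(L, diversity_outage(p, L))  # outage shrinks as (1-p)^L
```

Even modest redundancy (three paths) pushes outage to roughly $10^{-3}$ at $p = 0.9$, whereas lengthening a series chain only makes things worse.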
Quick Check
Three independent components each have reliability $p$. You connect them in a 2-out-of-3 majority system: the system works if at least 2 of the 3 components work. What is the system reliability?
The system works when exactly 2 components work (probability $\binom{3}{2} p^2 (1 - p) = 3p^2(1 - p)$) or all 3 work ($p^3$). Total: $R_s = 3p^2(1 - p) + p^3 = 3p^2 - 2p^3$.
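The majority-vote pattern generalizes to $k$-out-of-$n$ systems via the binomial distribution. A small helper (the function name is my own) evaluates it; the check uses $p = 0.9$ as an illustrative value:

```python
from math import comb

def k_out_of_n_reliability(p: float, k: int, n: int) -> float:
    """System works iff at least k of n identical independent components work."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# 2-out-of-3 majority at p = 0.9: 3*(0.9^2)*(0.1) + 0.9^3 = 0.243 + 0.729 = 0.972
print(round(k_out_of_n_reliability(0.9, 2, 3), 3))  # 0.972
```

Note the special cases: $k = n$ recovers the series formula $p^n$ and $k = 1$ the parallel formula $1 - (1-p)^n$.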
Redundancy vs. Cost Trade-off in Wireless Systems
Parallel redundancy (adding backup components) increases reliability at the cost of additional hardware, power, and management overhead. In 5G base stations, the 3GPP standard requires 99.999% availability (five 9s) over the air interface. Achieving this with 99%-reliable power amplifiers requires three parallel amplifiers (each has failure probability $0.01$; three in parallel give failure probability $0.01^3 = 10^{-6}$, exceeding the five-nines requirement).
- 3GPP TS 22.261 mandates 99.999% availability for Ultra-Reliable Low Latency Communication (URLLC)
- Each additional parallel unit roughly doubles the hardware and power cost
- Active standby (hot spare) achieves faster failover than passive standby at higher steady-state cost
Common Mistake: Confusing Series and Parallel Reliability Formulas
Mistake:
A common mistake is to apply the parallel formula to a series system or vice versa. The two formulas look structurally similar and are easily swapped when working quickly.
Correction:
Remember the logic: series = AND (all must work), parallel = OR (at least one must work). The series formula is the probability of an intersection; the parallel formula uses De Morgan to convert a union to a complement of an intersection. Always check: does the system require every component, or just one?
Series vs. Parallel System Properties
| Property | Series System | Parallel System |
|---|---|---|
| Logic | ALL components must work (AND) | AT LEAST ONE must work (OR) |
| Reliability formula | $R_s = \prod_{i=1}^{n} p_i$ | $R_s = 1 - \prod_{i=1}^{n} (1 - p_i)$ |
| Effect of adding components | Decreases | Increases |
| Bottleneck | Weakest component dominates | Strongest component dominates |
| Wireless analog | Multi-hop relay chain | Spatial diversity combining |
| Asymptotic ($n \to \infty$) | $R_s \to 0$ (even if every $p_i$ is close to 1, as long as $p_i < 1$) | $R_s \to 1$ (even if every $p_i$ is small, as long as $p_i > 0$) |
Series and Parallel System Block Diagrams
Series/Parallel Reliability: Block Diagram Animation
Common Mistake: Independence Is an Assumption, Not a Fact
Mistake:
Applying the independence formula when the components are actually correlated leads to systematically optimistic reliability estimates.
Correction:
In practice, components may share a power supply, a common mode of failure (e.g., an earthquake), or be manufactured by the same defective production batch. These common-cause failures violate independence. The correct analysis uses the law of total probability: condition on whether the common-cause event occurs.