Variational Autoencoder (VAE)
Definition: Variational Autoencoder
A VAE consists of:
- Encoder $q_\phi(z \mid x)$: maps input $x$ to a latent distribution $\mathcal{N}(\mu_\phi(x), \operatorname{diag}(\sigma_\phi^2(x)))$
- Decoder $p_\theta(x \mid z)$: generates $x$ from the latent $z$
- Loss: the negative ELBO, $\mathcal{L}(\theta, \phi) = -\mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] + D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p(z))$
The reparameterisation trick: $z = \mu + \sigma \odot \epsilon$, $\epsilon \sim \mathcal{N}(0, I)$.
Definition: KL Divergence for Gaussians
For $q = \mathcal{N}(\mu, \sigma^2)$ and $p = \mathcal{N}(0, 1)$:
$$D_{\mathrm{KL}}(q \,\|\, p) = \tfrac{1}{2}\left(\mu^2 + \sigma^2 - \log \sigma^2 - 1\right)$$
Summed over a $d$-dimensional diagonal-Gaussian latent this gives $D_{\mathrm{KL}} = -\tfrac{1}{2} \sum_{j=1}^{d} \left(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\right)$.
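As a quick numerical check of this closed form (not from the original text; variable names are illustrative), it can be compared against a Monte Carlo estimate in PyTorch:

```python
import torch

# Illustrative sanity check: closed-form KL( N(mu, sigma^2) || N(0, 1) )
# versus a Monte Carlo estimate of E_q[log q(z) - log p(z)].
mu, sigma = torch.tensor(0.7), torch.tensor(1.5)

closed_form = 0.5 * (mu**2 + sigma**2 - torch.log(sigma**2) - 1)

z = mu + sigma * torch.randn(1_000_000)          # samples from q
log_q = torch.distributions.Normal(mu, sigma).log_prob(z)
log_p = torch.distributions.Normal(0.0, 1.0).log_prob(z)
mc_estimate = (log_q - log_p).mean()

print(closed_form.item(), mc_estimate.item())    # should agree to ~2 decimals
```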
Definition: Reparameterisation Trick
Instead of sampling $z \sim \mathcal{N}(\mu, \sigma^2)$ directly, sample $\epsilon \sim \mathcal{N}(0, I)$ and compute $z = \mu + \sigma \odot \epsilon$. This makes the sampling operation differentiable with respect to $\mu$ and $\sigma$, and hence with respect to the encoder parameters $\phi$.
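A minimal PyTorch sketch (illustrative names, not from the original text) showing that gradients reach $\mu$ and $\log\sigma^2$ only because the randomness is isolated in $\epsilon$:

```python
import torch

# Illustrative reparameterised sample: the randomness lives in eps,
# so gradients flow back into mu and log_var.
mu = torch.zeros(4, requires_grad=True)
log_var = torch.zeros(4, requires_grad=True)

eps = torch.randn(4)                     # epsilon ~ N(0, I), no gradient needed
z = mu + torch.exp(0.5 * log_var) * eps  # z = mu + sigma * eps

loss = (z ** 2).sum()                    # any loss that depends on the sample
loss.backward()
print(mu.grad, log_var.grad)             # both gradients are populated
```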
Definition: Evidence Lower Bound (ELBO)
The ELBO is a lower bound on the log-likelihood:
$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p(z))$$
The first term is reconstruction quality; the second is latent regularisation.
Definition: Beta-VAE
Multiply the KL term by a factor $\beta > 1$ to encourage more disentangled latent representations:
$$\mathcal{L}_\beta = \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - \beta \, D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p(z))$$
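A minimal sketch of a beta-weighted loss in PyTorch, assuming Bernoulli (binary cross-entropy) reconstruction as in the loss example later in this section; the function name and default beta value are illustrative:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x_recon, x, mu, logvar, beta=4.0):
    """Negative beta-ELBO: reconstruction + beta * KL, summed over the batch."""
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```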
Theorem: ELBO Derivation
Starting from Jensen's inequality applied to $\log p_\theta(x) = \log \mathbb{E}_{q_\phi(z \mid x)}\!\left[\frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]$:
$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right] = \mathbb{E}_{q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p(z)) = \mathrm{ELBO}$$
The gap equals $D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x))$, the divergence between the approximate posterior and the true posterior, so $\log p_\theta(x) = \mathrm{ELBO} + D_{\mathrm{KL}}(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x))$.
Maximising the ELBO simultaneously improves reconstruction and makes the approximate posterior closer to the true posterior.
Example: Implementing a VAE
Build a VAE for 28x28 grayscale images.
Implementation
```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, latent_dim=16):
        super().__init__()
        # Encoder: 784 -> 256 -> 128, with separate heads for mu and log-variance
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU())
        self.fc_mu = nn.Linear(128, latent_dim)
        self.fc_logvar = nn.Linear(128, latent_dim)
        # Decoder: latent -> 128 -> 256 -> 784, sigmoid keeps pixel values in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, 784), nn.Sigmoid())

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z).view_as(x), mu, logvar
```
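A quick shape check for the class above (illustrative; random tensors stand in for real images):

```python
model = VAE(latent_dim=16)
x = torch.rand(8, 1, 28, 28)        # batch of 8 stand-in grayscale images in [0, 1]
x_recon, mu, logvar = model(x)
print(x_recon.shape, mu.shape)      # torch.Size([8, 1, 28, 28]) torch.Size([8, 16])
```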
Example: VAE Loss Function
Implement the ELBO loss.
Implementation
```python
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, logvar):
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')   # Bernoulli reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # closed-form KL to N(0, I)
    return recon + kl
```
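A minimal training-step sketch combining the two examples above; random tensors stand in for an MNIST data loader:

```python
import torch

model = VAE(latent_dim=16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in data: replace with a real DataLoader of images scaled to [0, 1]
fake_batches = [torch.rand(64, 1, 28, 28) for _ in range(100)]

for epoch in range(5):
    for x in fake_batches:
        x_recon, mu, logvar = model(x)
        loss = vae_loss(x_recon, x, mu, logvar)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```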
Example: Latent Space Interpolation
Interpolate between two images in the latent space.
Approach
Encode both images to get latent codes $z_1$ and $z_2$ (in practice, the posterior means $\mu_1$ and $\mu_2$). Generate $z_\alpha = (1 - \alpha)\, z_1 + \alpha\, z_2$ for $\alpha \in [0, 1]$ and decode each $z_\alpha$, as sketched below.
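A minimal sketch of this approach, assuming the VAE class from the implementation example; posterior means are used as the latent codes:

```python
import torch

@torch.no_grad()
def interpolate(model, x1, x2, steps=8):
    """Decode evenly spaced points on the line between the latent codes of x1 and x2."""
    mu1, _ = model.encode(x1)                      # posterior mean of image 1
    mu2, _ = model.encode(x2)                      # posterior mean of image 2
    alphas = torch.linspace(0, 1, steps)
    zs = [(1 - a) * mu1 + a * mu2 for a in alphas]
    return [model.decoder(z).view(-1, 1, 28, 28) for z in zs]

# Example usage with random stand-in images:
# frames = interpolate(model, torch.rand(1, 1, 28, 28), torch.rand(1, 1, 28, 28))
```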
Interactive: VAE Latent Space Explorer (explore the 2D latent space of a trained VAE)
Interactive: KL vs Reconstruction Trade-off (see how beta affects the KL-reconstruction balance)
Figure: Generative Model Taxonomy
Figure: VAE Architecture
Quick Check
Why is the reparameterisation trick needed in VAEs?
- To make sampling faster
- To make the sampling operation differentiable with respect to encoder parameters
- To reduce the latent dimension
Quick Check
What does the KL term in the VAE loss encourage?
- Better reconstruction
- The approximate posterior to be close to the prior (regularisation)
- Faster training
Quick Check
In beta-VAE with beta > 1, what happens?
- Better reconstruction quality
- More disentangled latent space but blurrier reconstructions
- Faster convergence
Common Mistake: KL Vanishing (Posterior Collapse)
Mistake:
The KL term drops to zero and the decoder ignores the latent code.
Correction:
Use KL annealing (warm up beta from 0 to 1), free bits, or cyclic annealing.
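A minimal sketch of one of these fixes, linear KL annealing; the schedule length is illustrative:

```python
def kl_weight(step, warmup_steps=10_000):
    """Linearly anneal the KL weight from 0 to 1 over the first warmup_steps updates."""
    return min(1.0, step / warmup_steps)

# In the training loop: loss = recon + kl_weight(global_step) * kl
```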
Common Mistake: BCE Loss Without Sigmoid
Mistake:
Using BCELoss on decoder output without constraining to [0,1].
Correction:
Add sigmoid to the last decoder layer, or use BCEWithLogitsLoss.
Common Mistake: Predicting sigma Instead of log(sigma^2)
Mistake:
Predicting sigma directly, which requires softplus to ensure positivity.
Correction:
Predict log(sigma^2) and use exp(0.5*logvar) for std. More numerically stable.
Key Takeaway
VAEs provide a principled probabilistic framework for generation. The ELBO balances reconstruction and regularisation. The reparameterisation trick makes training end-to-end differentiable.
Key Takeaway
Generative models learn to sample from the data distribution. VAEs are simple but produce blurry samples. GANs are sharp but unstable. Diffusion models offer the best quality but are slow.
Why This Matters: VAEs for Channel Model Generation
VAEs can learn to generate realistic wireless channel realisations from measured data. The latent space captures channel parameters (delay spread, angular spread) in a continuous representation, enabling interpolation between channel conditions.
Historical Note: VAE: Probabilistic Deep Learning
Kingma and Welling introduced the VAE in 2013, unifying variational inference with deep learning. The reparameterisation trick was the key insight enabling backpropagation through stochastic layers.
Historical Note: GANs: Adversarial Training
Goodfellow et al. introduced GANs in 2014, training a generator against a discriminator. The resulting min-max game produces sharp samples but is notoriously difficult to train.
VAE
Variational Autoencoder: generative model that learns a latent space via variational inference.
ELBO
Evidence Lower Bound: the objective maximised in VAE training. Lower bound on log-likelihood.
KL Divergence
Kullback-Leibler divergence: measures how one distribution differs from another.
Diffusion Model
Generative model that learns to reverse a gradual noising process.
GAN
Generative Adversarial Network: generator and discriminator trained in a min-max game.
Generative Model Comparison
| Model | Training | Sample Quality | Diversity | Speed |
|---|---|---|---|---|
| VAE | Stable (ELBO) | Blurry | High | Fast |
| GAN | Unstable (adversarial) | Sharp | Mode collapse risk | Fast |
| Diffusion (DDPM) | Stable (denoising) | Best | High | Slow (iterative) |
| Flow Matching | Stable (ODE) | High | High | Medium |