Exercises

ex-sp-ch34-01

Easy

Implement a character-level tokenizer that maps a string to a list of integer IDs and back. Include <PAD> and <UNK> special tokens.
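
A minimal starter sketch (class and method names are illustrative): build the vocabulary from a reference corpus, reserve IDs 0 and 1 for <PAD> and <UNK>, and fall back to <UNK> for unseen characters.

```python
class CharTokenizer:
    def __init__(self, corpus):
        chars = sorted(set(corpus))
        self.itos = ["<PAD>", "<UNK>"] + chars          # id -> string
        self.stoi = {ch: i for i, ch in enumerate(self.itos)}

    def encode(self, text):
        # Characters outside the training corpus map to <UNK> (id 1).
        return [self.stoi.get(ch, 1) for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("5G NR beamforming")
ids = tok.encode("5G MIMO")          # 'M', 'I', 'O' are unseen -> <UNK>
print(ids, tok.decode(ids))
```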

ex-sp-ch34-02

Easy

Compute the TF-IDF matrix for three wireless paper abstracts using scikit-learn. Find the most important term in each document.
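
One possible starting point with scikit-learn's TfidfVectorizer; the three abstracts below are stand-ins for real ones, and "most important" is read as the highest TF-IDF weight in each row.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [   # stand-ins; paste in the real abstracts
    "massive MIMO beamforming improves spectral efficiency",
    "OFDM waveform design for the 5G NR uplink",
    "pilot-based channel estimation in fading channels",
]
vec = TfidfVectorizer()
X = vec.fit_transform(abstracts)            # (3, vocab_size), sparse
terms = vec.get_feature_names_out()
for i in range(X.shape[0]):
    row = X[i].toarray().ravel()
    print(f"doc {i}: top term = {terms[row.argmax()]}")
```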

ex-sp-ch34-03

Easy

Use the transformers library to tokenize the sentence "5G NR MIMO-OFDM beamforming" with GPT-2's tokenizer. Print the tokens and their IDs.
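
A sketch of the expected calls (the tokenizer files download on first use; a 'Ġ' in the output marks a leading space):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
text = "5G NR MIMO-OFDM beamforming"
tokens = tok.tokenize(text)                  # subword strings
ids = tok.convert_tokens_to_ids(tokens)      # integer IDs
print(list(zip(tokens, ids)))
```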

ex-sp-ch34-04

Easy

Create an nn.Embedding layer with vocabulary size 1000 and dimension 64. Verify that looking up token 42 returns the same result as multiplying a one-hot vector for token 42 by the embedding weight matrix.
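
A sketch of the check; the direct lookup and the one-hot product should agree exactly.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(1000, 64)            # weight shape: (1000, 64)
lookup = emb(torch.tensor(42))          # direct lookup of token 42

one_hot = torch.zeros(1000)
one_hot[42] = 1.0
matmul = one_hot @ emb.weight           # picks out row 42 of the weights

print(torch.allclose(lookup, matmul))   # True
```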

ex-sp-ch34-05

Easy

Compute cosine similarity between three pairs of words using pre-trained GloVe embeddings via torchtext.
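
A sketch assuming torchtext's GloVe loader (torchtext is in maintenance mode, so pin a compatible version); the word pairs are illustrative.

```python
import torch.nn.functional as F
from torchtext.vocab import GloVe

glove = GloVe(name="6B", dim=100)       # downloads the vectors on first use

def cos(a, b):
    return F.cosine_similarity(glove[a], glove[b], dim=0).item()

# Illustrative pairs; pick your own three.
for a, b in [("antenna", "receiver"), ("signal", "noise"), ("wireless", "banana")]:
    print(f"{a} / {b}: {cos(a, b):.3f}")
```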

ex-sp-ch34-06

Medium

Implement BPE training from scratch. Start with character-level tokens and perform 20 merges on a small corpus of 5 sentences. Show the vocabulary after each merge.
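
A Sennrich-style starter sketch on a one-sentence stand-in corpus (swap in your five sentences): each word is kept as a space-separated symbol sequence with an end-of-word marker, and each step merges the most frequent adjacent pair.

```python
import re
from collections import Counter

def pair_counts(vocab):
    pairs = Counter()
    for word, freq in vocab.items():
        syms = word.split()
        for a, b in zip(syms, syms[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge(pair, vocab):
    # Replace the pair only where both symbols are whole tokens.
    pat = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pat.sub("".join(pair), w): f for w, f in vocab.items()}

corpus = "the channel fades the signal the receiver decodes the signal".split()
vocab = Counter(" ".join(w) + " </w>" for w in corpus)

for step in range(20):
    pairs = pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge(best, vocab)
    symbols = {s for w in vocab for s in w.split()}
    print(f"merge {step + 1}: {best}, vocab size = {len(symbols)}")
```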

ex-sp-ch34-07

Medium

Train a Word2Vec skip-gram model with negative sampling on a corpus of 100 wireless paper titles. Visualize the learned embeddings with PCA.
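
One route is gensim (sg=1 selects skip-gram, negative=5 enables negative sampling; gensim >= 4 API). The titles below are placeholders for the 100 real ones; if your goal is a from-scratch implementation, this still serves as a reference.

```python
from gensim.models import Word2Vec
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

titles = [   # stand-ins; replace with your 100 paper titles
    "deep learning for massive mimo channel estimation",
    "beamforming design in millimeter wave systems",
    "ofdm waveform optimization for 5g nr",
] * 34
sentences = [t.split() for t in titles]

model = Word2Vec(sentences, vector_size=64, window=3,
                 sg=1, negative=5, min_count=1)

words = list(model.wv.index_to_key)
coords = PCA(n_components=2).fit_transform(model.wv[words])
plt.scatter(coords[:, 0], coords[:, 1])
for w, (x, y) in zip(words, coords):
    plt.annotate(w, (x, y))
plt.show()
```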

ex-sp-ch34-08

Medium

Build a bigram language model from a corpus. Implement Laplace smoothing and compute perplexity on a held-out test set.
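
A self-contained sketch on a tiny stand-in corpus; perplexity is the exponentiated average negative log probability over the test bigrams.

```python
import math
from collections import Counter

train = "the signal fades the receiver decodes the signal".split()
test = "the receiver fades".split()          # held-out stand-in

unigrams = Counter(train)
bigrams = Counter(zip(train, train[1:]))
V = len(unigrams)

def p(w1, w2):
    # Laplace (add-one) smoothing over a vocabulary of size V.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

log_p = sum(math.log(p(w1, w2)) for w1, w2 in zip(test, test[1:]))
ppl = math.exp(-log_p / (len(test) - 1))     # normalize by bigram count
print(f"perplexity = {ppl:.2f}")
```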

ex-sp-ch34-09

Medium

Implement scaled dot-product attention from scratch in PyTorch. Verify your implementation matches nn.MultiheadAttention on random inputs.
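
A starter sketch. The quick sanity check below compares against F.scaled_dot_product_attention (PyTorch >= 2.0); matching nn.MultiheadAttention as the exercise asks additionally requires setting its input/output projections to identity.

```python
import math
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # softmax(Q K^T / sqrt(d_k)) V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(scores, dim=-1) @ v

q, k, v = (torch.randn(2, 5, 16) for _ in range(3))
ref = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(attention(q, k, v), ref, atol=1e-6))   # True
```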

ex-sp-ch34-10

Medium

Build a simple RNN language model and train it on Shakespeare text. Generate 100 characters of text using temperature sampling.
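
A skeleton with the training loop omitted (train with cross-entropy on next-character targets before sampling). Temperature divides the logits before the softmax: values below 1 sharpen the distribution, values above 1 flatten it.

```python
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x, h=None):
        z, h = self.rnn(self.emb(x), h)
        return self.head(z), h

@torch.no_grad()
def sample(model, start_id, length=100, temperature=0.8):
    ids, h = [start_id], None
    for _ in range(length):
        logits, h = model(torch.tensor([[ids[-1]]]), h)
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        ids.append(torch.multinomial(probs, 1).item())
    return ids
```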

ex-sp-ch34-11

Medium

Implement positional encoding (sinusoidal) and show that it produces unique position representations. Plot the encoding matrix as a heatmap.
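
A sketch of the standard sinusoidal formulation, with a rough uniqueness check (no two rows identical) and the heatmap:

```python
import numpy as np
import matplotlib.pyplot as plt

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / 10000 ** (2 * (i // 2) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])   # even dims: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])   # odd dims: cosine
    return pe

pe = positional_encoding(100, 64)
print(len(np.unique(pe.round(6), axis=0)) == 100)   # all rows distinct
plt.imshow(pe, aspect="auto", cmap="viridis")
plt.xlabel("dimension"); plt.ylabel("position"); plt.colorbar(); plt.show()
```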

ex-sp-ch34-12

Hard

Implement a complete transformer decoder block (self-attention + FFN + residual connections + layer norm) and train it as a character-level language model on a wireless textbook excerpt.
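
A post-norm sketch of the block itself; wrap it with a token embedding, positional encoding, and an output head, and train with cross-entropy to get the full character-level LM.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=128, n_heads=4, d_ff=512):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)

    def forward(self, x):
        T = x.size(1)
        # Causal mask: position t may not attend to positions > t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        a, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.ln1(x + a)                  # residual + layer norm
        return self.ln2(x + self.ffn(x))     # residual + layer norm
```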

ex-sp-ch34-13

Hard

Train Word2Vec on a corpus of 3GPP specification abstracts and evaluate on a custom analogy test set (e.g., "UE is to downlink as gNB is to ___").
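
Once a model is trained, gensim's most_similar performs the analogy arithmetic (b - a + a*). The two stand-in sentences below are placeholders for the real tokenized abstracts.

```python
from gensim.models import Word2Vec

# Stand-in sentences; replace with tokenized 3GPP abstracts.
sentences = [["ue", "transmits", "on", "uplink"],
             ["gnb", "transmits", "on", "downlink"]] * 50
model = Word2Vec(sentences, vector_size=64, sg=1, negative=5, min_count=1)

# "UE is to downlink as gNB is to ___"  ->  downlink - ue + gnb
print(model.wv.most_similar(positive=["downlink", "gnb"],
                            negative=["ue"], topn=3))
```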

ex-sp-ch34-14

Hard

Implement multi-head attention from scratch (without using nn.MultiheadAttention). Show that after splitting into heads and concatenating, each head exhibits a distinct attention pattern.
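
A from-scratch sketch that also returns the per-head attention maps, so you can plot att[0, h] for each head h and compare the patterns:

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.h, self.d_k = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (B, T, D) -> (B, h, T, d_k): each head sees its own slice.
        split = lambda t: t.view(B, T, self.h, self.d_k).transpose(1, 2)
        q, k, v = map(split, (q, k, v))
        att = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_k),
                            dim=-1)
        y = (att @ v).transpose(1, 2).reshape(B, T, D)   # concatenate heads
        return self.proj(y), att                         # att: (B, h, T, T)

mha = MultiHeadAttention()
_, att = mha(torch.randn(1, 6, 64))
print((att[0, 0] - att[0, 1]).abs().max())   # heads differ even at init
```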

ex-sp-ch34-15

Hard

Build a retrieval system using sentence embeddings: encode paper abstracts with a pre-trained model, store in a vector database (FAISS), and query with natural language.
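
A sketch assuming faiss-cpu and sentence-transformers are installed; all-MiniLM-L6-v2 is one common encoder choice, and the abstracts are placeholders. With L2-normalized embeddings, inner product equals cosine similarity.

```python
import faiss
from sentence_transformers import SentenceTransformer

abstracts = [   # stand-ins; replace with real abstracts
    "Massive MIMO channel estimation with deep learning.",
    "Beam management procedures in 5G NR millimeter wave.",
    "LDPC code design for short block lengths.",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(abstracts, normalize_embeddings=True).astype("float32")

index = faiss.IndexFlatIP(emb.shape[1])   # inner product on unit vectors
index.add(emb)

query = model.encode(["how do base stations steer beams?"],
                     normalize_embeddings=True).astype("float32")
scores, ids = index.search(query, 2)      # top-2 nearest abstracts
for s, i in zip(scores[0], ids[0]):
    print(f"{s:.3f}  {abstracts[i]}")
```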

ex-sp-ch34-16

Hard

Implement the GloVe training objective from scratch using PyTorch. Train on a co-occurrence matrix built from a small corpus and compare the resulting embeddings with Word2Vec embeddings.
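
A sketch of the weighted least-squares objective from the GloVe paper, sum over nonzero cells of f(X_ij) (w_i . c_j + b_i + b_j - log X_ij)^2, run here on a random stand-in co-occurrence matrix (build yours from corpus windows):

```python
import torch

# Stand-in co-occurrence matrix; replace with counts from your corpus.
V, d = 50, 32
X = torch.randint(0, 20, (V, V)).float()
i, j = X.nonzero(as_tuple=True)                 # train only on X_ij > 0
logX = X[i, j].log()
f = ((X[i, j] / 100.0) ** 0.75).clamp(max=1.0)  # GloVe weighting, x_max=100

w = torch.randn(V, d, requires_grad=True)       # word vectors
c = torch.randn(V, d, requires_grad=True)       # context vectors
bw = torch.zeros(V, requires_grad=True)
bc = torch.zeros(V, requires_grad=True)
opt = torch.optim.Adagrad([w, c, bw, bc], lr=0.05)

for _ in range(200):
    pred = (w[i] * c[j]).sum(-1) + bw[i] + bc[j]
    loss = (f * (pred - logX) ** 2).sum()
    opt.zero_grad(); loss.backward(); opt.step()

embeddings = (w + c).detach()   # the paper sums word + context vectors
```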

ex-sp-ch34-17

Challenge

Build a domain-specific BPE tokenizer for wireless communications. Train on 3GPP specs and IEEE papers. Compare token efficiency (tokens per document) against GPT-2's general-purpose tokenizer.
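
One route uses the Hugging Face tokenizers library; the file names below are hypothetical placeholders for your corpus dumps. Token efficiency then reduces to counting tokens per document under each tokenizer.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers
from transformers import AutoTokenizer

files = ["3gpp_specs.txt", "ieee_papers.txt"]   # hypothetical corpus files

tok = Tokenizer(models.BPE(unk_token="<unk>"))
tok.pre_tokenizer = pre_tokenizers.Whitespace()
trainer = trainers.BpeTrainer(vocab_size=16000,
                              special_tokens=["<unk>", "<pad>"])
tok.train(files, trainer)

gpt2 = AutoTokenizer.from_pretrained("gpt2")
doc = "The gNB configures CSI-RS resources for PMI reporting."
print("domain BPE tokens:", len(tok.encode(doc).tokens))
print("GPT-2 tokens:     ", len(gpt2.tokenize(doc)))
```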

ex-sp-ch34-18

Challenge

Implement Flash Attention (the tiling algorithm) from scratch in Python/NumPy. Benchmark memory usage against standard attention for sequences of length 1024, 2048, and 4096.
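
A NumPy sketch of the key/value tiling with the online-softmax rescaling; full FlashAttention also tiles the queries and fuses everything into one kernel, and the memory benchmark across the three sequence lengths is left to the exercise. The dense reference verifies correctness.

```python
import numpy as np

def flash_attention(Q, K, V, block=256):
    # Tiles over K/V with an online softmax, so the full (N x N) score
    # matrix is never materialized -- the core idea of FlashAttention.
    N, d = Q.shape
    O = np.zeros((N, d))
    m = np.full(N, -np.inf)          # running row-wise max
    l = np.zeros(N)                  # running softmax denominator
    for j in range(0, N, block):
        S = Q @ K[j:j + block].T / np.sqrt(d)       # one (N, block) tile
        m_new = np.maximum(m, S.max(axis=1))
        scale = np.exp(m - m_new)                   # rescale old state
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=1)
        O = O * scale[:, None] + P @ V[j:j + block]
        m = m_new
    return O / l[:, None]

# Correctness check against dense attention.
N, d = 1024, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))
S = Q @ K.T / np.sqrt(d)
P = np.exp(S - S.max(axis=1, keepdims=True))
ref = (P / P.sum(axis=1, keepdims=True)) @ V
print(np.allclose(flash_attention(Q, K, V), ref))   # True
```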