Programmatic Figure Generation
Reproducible Figure Scripts
Every figure in a paper should be generated by a script, never by manual GUI interaction. A figure generation script takes simulation data as input and produces a publication-ready PDF as output. This makes figures reproducible, version-controllable, and trivially updatable when simulations are re-run.
Definition: Anatomy of a Figure Generation Script
A well-structured figure generation script has four parts:
from pathlib import Path
import matplotlib.pyplot as plt
import pandas as pd
# 1. Configuration
STYLE = {'font.family': 'serif', 'font.size': 8}  # ... plus any further rcParams
OUTPUT_DIR = Path('figures/')
# 2. Data loading
df = pd.read_csv('results/ber_sweep.csv')
# 3. Plotting
def make_ber_figure(df):
    fig, ax = plt.subplots(figsize=(3.5, 2.5))
    # ... plotting code ...
    return fig
# 4. Export
if __name__ == '__main__':
    plt.rcParams.update(STYLE)
    fig = make_ber_figure(df)
    fig.savefig(OUTPUT_DIR / 'ber.pdf', bbox_inches='tight')
Definition: Makefile for Automated Paper Building
A Makefile orchestrates simulation, figure generation, and
LaTeX compilation:
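FIGURES = figures/ber.pdf figures/constellation.pdf
TABLES = tables/results.tex
# Recipe lines must begin with a tab character.
# The final pdflatex pass resolves the citations written by bibtex.
paper.pdf: paper.tex $(FIGURES) $(TABLES)
	pdflatex paper.tex
	bibtex paper
	pdflatex paper.tex
	pdflatex paper.tex
# figures/constellation.pdf needs an analogous rule (not shown).
figures/ber.pdf: scripts/plot_ber.py results/ber.csv
	python scripts/plot_ber.py
tables/results.tex: scripts/gen_tables.py results/sim.csv
	python scripts/gen_tables.py
results/ber.csv: scripts/simulate.py
	python scripts/simulate.py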
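The rebuild decision that Make applies to each rule can be sketched in a few lines of Python. This is a simplified illustration rather than a description of Make's internals: the helper name `needs_rebuild` is hypothetical, and it compares only file modification times.

```python
from pathlib import Path


def needs_rebuild(target, prerequisites):
    """Return True if the target is missing or older than any prerequisite.

    Simplified model of Make's timestamp rule (hypothetical helper).
    """
    target = Path(target)
    if not target.exists():
        return True  # nothing built yet, so the recipe must run
    target_mtime = target.stat().st_mtime
    # Rebuild when at least one input was modified after the target.
    return any(Path(p).stat().st_mtime > target_mtime for p in prerequisites)
```

Under this model, editing scripts/plot_ber.py makes figures/ber.pdf stale, which in turn makes paper.pdf stale, while results/ber.csv is left alone because none of its prerequisites changed.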
Theorem: Simulation-Visualization Separation
A reproducible paper separates three concerns:
- Simulation produces raw data (CSV, HDF5, NumPy)
- Visualization reads raw data, produces figures (PDF, PGF)
- Document includes figures and tables, produces paper
Changes to visualization (e.g., colors, labels) do not require re-running simulations. Changes to document text do not require re-generating figures.
This is the Model-View-Controller pattern applied to scientific papers. Simulation is the model, figures are the view, and the Makefile is the controller.
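The separation can be illustrated with a toy pipeline that uses only the standard library. The function names `simulate`, `load`, and `render` are hypothetical, the closed-form BPSK curve stands in for an expensive simulation, and `render` returns a description instead of drawing, so this is a sketch of the structure rather than a real plotting stage.

```python
import csv
import math


def simulate(csv_path):
    """Stage 1: write raw data (closed-form BPSK BER as a cheap stand-in)."""
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['snr_db', 'ber'])
        for snr_db in range(0, 11, 2):
            snr = 10 ** (snr_db / 10)
            writer.writerow([snr_db, 0.5 * math.erfc(math.sqrt(snr))])


def load(csv_path):
    """Stage 2 input: read the stored data; never call the simulator."""
    with open(csv_path, newline='') as f:
        return [(float(r['snr_db']), float(r['ber'])) for r in csv.DictReader(f)]


def render(data, style):
    """Stand-in for plotting: style is a parameter, the data is untouched."""
    return {'n_points': len(data), **style}
```

Calling `render(load(path), {'color': 'black'})` and then `render(load(path), {'color': 'blue'})` changes only the view; the CSV written by `simulate` is read twice and never regenerated.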
Example: Complete Figure Generation Pipeline
Build a pipeline that runs a BER simulation, saves results to CSV, and generates an IEEE-formatted figure from the data.
Step 1: Simulation script
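# simulate_ber.py
from pathlib import Path
import numpy as np
import pandas as pd
from scipy.special import erfc
snr_db = np.arange(0, 21)    # E_b/N_0 grid, 0 to 20 dB
snr = 10**(snr_db / 10)      # convert dB to linear scale
# Closed-form AWGN bit error rates (the 16-QAM expression is the usual approximation)
results = pd.DataFrame({
    'snr_db': snr_db,
    'ber_bpsk': 0.5 * erfc(np.sqrt(snr)),
    'ber_qpsk': 0.5 * erfc(np.sqrt(snr)),  # same per-bit BER as BPSK
    'ber_16qam': 3/8 * erfc(np.sqrt(2*snr/5)),
})
Path('results').mkdir(exist_ok=True)  # to_csv does not create directories
results.to_csv('results/ber_sweep.csv', index=False)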
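The script above evaluates closed-form expressions rather than simulating individual bits. As a rough cross-check, a Monte Carlo estimate for BPSK over AWGN could look like the following sketch. The function names are hypothetical, unit-energy symbols are assumed, and pure Python is used for self-containment, so it is far slower than a vectorized NumPy version.

```python
import math
import random


def bpsk_ber_monte_carlo(ebn0_db, n_bits=200_000, seed=0):
    """Estimate the BPSK bit error rate over AWGN by direct simulation."""
    rng = random.Random(seed)          # fixed seed keeps the run reproducible
    ebn0 = 10 ** (ebn0_db / 10)        # dB to linear
    sigma = math.sqrt(1 / (2 * ebn0))  # noise std for E_b = 1, variance N_0/2
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0  # map 0 -> -1, 1 -> +1
        received = symbol + rng.gauss(0.0, sigma)
        decided = 1 if received > 0 else 0
        errors += decided != bit
    return errors / n_bits


def bpsk_ber_theory(ebn0_db):
    """Closed-form BPSK BER, the same expression as the ber_bpsk column."""
    return 0.5 * math.erfc(math.sqrt(10 ** (ebn0_db / 10)))
```

At moderate E_b/N_0 the two functions should agree to within the statistical error of the estimate. At high E_b/N_0 the error count becomes too small to be meaningful unless `n_bits` is increased substantially.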
Step 2: Figure script
# plot_ber.py
import pandas as pd
import matplotlib.pyplot as plt
IEEE_STYLE = {
    'font.family': 'serif', 'font.size': 8,
    'figure.figsize': (3.5, 2.625),
    'savefig.dpi': 600,
}
plt.rcParams.update(IEEE_STYLE)
df = pd.read_csv('results/ber_sweep.csv')
fig, ax = plt.subplots()
for col, fmt in [('ber_bpsk','o-'), ('ber_qpsk','s--'),
                 ('ber_16qam','D:')]:
    label = col.replace('ber_', '').upper()  # e.g. 'ber_16qam' -> '16QAM'
    ax.semilogy(df['snr_db'], df[col], fmt, label=label, ms=3)
ax.set(xlabel=r'$E_b/N_0$ (dB)', ylabel='BER')
ax.set_ylim(1e-6, 1)
ax.legend(loc='lower left')
ax.grid(True, which='both', alpha=0.3)
fig.savefig('figures/ber.pdf', bbox_inches='tight')
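The figure size used above, 3.5 by 2.625 inches, corresponds to a single-column width with a 4:3 aspect ratio. A small helper can keep such sizes consistent across scripts. The name `ieee_figsize` is hypothetical, and the 7.16 inch double-column width is an assumption based on commonly quoted IEEE template dimensions; check the target journal's author guidelines before relying on it.

```python
# Assumed column widths in inches; verify against the journal's template.
IEEE_WIDTHS_IN = {'single': 3.5, 'double': 7.16}


def ieee_figsize(columns='single', aspect=0.75):
    """Return (width, height) in inches for a one- or two-column figure."""
    width = IEEE_WIDTHS_IN[columns]
    return (width, round(width * aspect, 3))
```

The result can be passed directly as the `'figure.figsize'` entry of the style dictionary or as the `figsize` argument of `plt.subplots`.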
Example: Batch Generation of All Paper Figures
Generate all figures for a paper with a single script.
Implementation
# generate_all_figures.py
from pathlib import Path
import matplotlib.pyplot as plt
import pandas as pd
OUTPUT = Path('figures')
OUTPUT.mkdir(exist_ok=True)
STYLE = {'font.family': 'serif', 'font.size': 8,
         'figure.figsize': (3.5, 2.625)}
def make_figure_1():
    # ... plotting code ...
    pass
def make_figure_2():
    # ... plotting code ...
    pass
FIGURES = {
    'fig1_ber.pdf': make_figure_1,
    'fig2_constellation.pdf': make_figure_2,
}
if __name__ == '__main__':
    plt.rcParams.update(STYLE)
    for name, func in FIGURES.items():
        print(f'Generating {name}...')
        fig = func()
        if fig:
            fig.savefig(OUTPUT / name, bbox_inches='tight')
            plt.close(fig)
    print('All figures generated.')
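When a paper has many figures, it can be convenient to regenerate only a few of them from the command line. The sketch below shows one possible way to add that to a registry like `FIGURES`. The helpers `select_figures` and `parse_args` are hypothetical and not part of the script above.

```python
import argparse


def select_figures(registry, requested):
    """Return the subset of the registry to build; an empty request means all."""
    if not requested:
        return dict(registry)
    unknown = sorted(set(requested) - set(registry))
    if unknown:
        raise KeyError(f'Unknown figure(s): {", ".join(unknown)}')
    # Preserve the order in which the figures were requested.
    return {name: registry[name] for name in requested}


def parse_args(argv=None):
    """Parse optional figure names, e.g. `fig1_ber.pdf`, from the command line."""
    parser = argparse.ArgumentParser(description='Generate paper figures.')
    parser.add_argument('names', nargs='*',
                        help='figure file names to build (default: all)')
    return parser.parse_args(argv)
```

In the main block, the loop would then iterate over `select_figures(FIGURES, parse_args().names).items()` instead of `FIGURES.items()`.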
Interactive figure: Paper Pipeline Visualizer
See how changes in simulation parameters propagate through the data → figure → document pipeline.
Interactive figure: Paper Build Process Animation
Watch the sequential build process: simulate, generate tables, generate figures, compile LaTeX.
Why This Matters: Reproducible Research in Wireless
IEEE Signal Processing Society publications increasingly encourage authors to
share code alongside papers. A well-structured figure generation
pipeline, in which make paper regenerates the entire paper from
raw simulation data, meets such expectations with little extra effort.
This is especially important for wireless system simulations, where
subtle differences between BER curves can lead to different conclusions.
Key Takeaway
Every figure and table should be generated by a script. Manual figure creation is a reproducibility anti-pattern. Use a Makefile to orchestrate simulation → data → figures → document compilation.
Historical Note: Literate Programming and Reproducibility
Donald Knuth introduced literate programming in 1984, where code and documentation are woven together in a single source. This philosophy influenced Jupyter notebooks, R Markdown, and modern reproducible research practices. The programmatic figure generation advocated here is a direct descendant of Knuth's vision.
Common Mistake: Hardcoded Absolute Paths
Mistake:
Using absolute paths like /home/user/project/figures/ in scripts,
which break on any other machine.
Correction:
Use pathlib.Path with paths relative to the script's own location, or environment variables:
from pathlib import Path
OUTPUT = Path(__file__).parent / 'figures'
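The environment-variable option mentioned in the correction can be combined with the script-relative default. The sketch below assumes a variable named `FIGURE_DIR`, which is a hypothetical name chosen for illustration, as is the helper `resolve_output_dir`.

```python
import os
from pathlib import Path


def resolve_output_dir(script_file, env_var='FIGURE_DIR'):
    """Return the figure directory: env override if set, else next to the script."""
    override = os.environ.get(env_var)
    if override:
        return Path(override)
    # Resolving makes the result independent of the current working directory.
    return Path(script_file).resolve().parent / 'figures'
```

A script would call `resolve_output_dir(__file__)` once at startup, so collaborators can redirect output with `FIGURE_DIR=/some/dir python plot_ber.py` without editing the code.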
Quick Check
What is the primary benefit of using a Makefile for paper generation?
- It makes LaTeX compile faster
- It tracks dependencies and only regenerates what changed
- It automatically fixes LaTeX errors
- It is required by IEEE
Answer: It tracks dependencies and only regenerates what changed. Make re-runs only the targets whose dependencies have been modified, saving time and ensuring consistency.
Makefile
A build automation file that specifies dependencies between targets and the commands to produce them, ensuring reproducible builds.
Idempotent
A property of a process where running it multiple times produces the same result as running it once.
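One practical way to check that a data-producing step is idempotent is to run it twice and compare checksums of the output. The sketch below uses a seeded random generator as a stand-in for a stochastic simulation; `write_samples` and `sha256_of` are hypothetical helpers.

```python
import hashlib
import random
from pathlib import Path


def write_samples(path, seed=1234, n=5):
    """Write n pseudo-random values to a CSV file; the seed fixes the output."""
    rng = random.Random(seed)
    lines = ['index,value'] + [f'{i},{rng.random():.6f}' for i in range(n)]
    Path(path).write_text('\n'.join(lines) + '\n')


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()
```

If the digest differs between two runs with the same inputs, some hidden state (an unseeded random generator, a timestamp, unordered iteration) is leaking into the output.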