Model Export and Inference Optimization
Definition: TorchScript Export
# Tracing (records ops for one example input; use only for fixed control flow)
traced = torch.jit.trace(model, example_input)
traced.save('model_traced.pt')
# Scripting (compiles the Python source; handles data-dependent control flow)
scripted = torch.jit.script(model)
scripted.save('model_scripted.pt')
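A minimal round-trip sketch of the tracing path (the toy model and tensor shapes below are illustrative, not from the original): trace a small module, save it, reload the archive with torch.jit.load, and confirm the reloaded model reproduces the original's outputs.

```python
import torch
import torch.nn as nn

# Toy model with fixed control flow, so tracing is safe.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.eval()

example_input = torch.randn(1, 8)
traced = torch.jit.trace(model, example_input)
traced.save('model_traced.pt')

# Reload the standalone TorchScript archive (no Python class definition needed).
reloaded = torch.jit.load('model_traced.pt')
with torch.no_grad():
    out_orig = model(example_input)
    out_reloaded = reloaded(example_input)

# Outputs should agree to floating-point tolerance.
assert torch.allclose(out_orig, out_reloaded)
```

Because the saved archive is self-contained, it can be loaded for inference in a process (or in C++ via libtorch) that has no access to the model's Python source.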
Definition: ONNX Export
torch.onnx.export(model, example_input, 'model.onnx',
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch'}})
Example: Post-Training Quantisation
Quantise a model to INT8 for faster inference.
Solution
# Dynamic quantisation: weights stored as INT8, activations quantised on the fly
model_int8 = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)
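A runnable sketch of dynamic quantisation on a toy model (layer sizes are illustrative): only the nn.Linear layers listed in the second argument are converted, and the quantised model's outputs stay close to, but are not bit-identical with, the FP32 originals.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Convert the Linear layers to dynamically quantised INT8 equivalents.
model_int8 = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
with torch.no_grad():
    fp32_out = model(x)
    int8_out = model_int8(x)

# Same interface and output shape; values differ slightly due to INT8 weights.
assert int8_out.shape == fp32_out.shape
```

Dynamic quantisation needs no calibration data or retraining, which is why it is the usual first step before trying static quantisation or quantisation-aware training.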
Inference Speed Comparison
Benchmark the same model in eager, TorchScript, and quantised form to compare per-inference latency.
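A minimal CPU benchmarking sketch (the helper name, model, and iteration counts are illustrative): time the eager FP32 model, its traced TorchScript version, and a dynamically quantised version on the same input, with warm-up runs excluded from the measurement.

```python
import time
import torch
import torch.nn as nn

def benchmark(fn, x, warmup=5, iters=50):
    """Return mean latency in milliseconds for fn(x) on CPU."""
    with torch.no_grad():
        for _ in range(warmup):   # warm-up runs, excluded from timing
            fn(x)
        start = time.perf_counter()
        for _ in range(iters):
            fn(x)
    return (time.perf_counter() - start) / iters * 1e3

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()
x = torch.randn(1, 256)

traced = torch.jit.trace(model, x)
model_int8 = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

for name, m in [('eager fp32', model),
                ('torchscript', traced),
                ('dynamic int8', model_int8)]:
    print(f'{name}: {benchmark(m, x):.3f} ms')
```

Absolute numbers depend heavily on hardware and tensor sizes, so compare variants on the same machine with the batch shapes you expect in production.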