Model Export and Inference Optimization

TorchScript Export

Definition: TorchScript is PyTorch's serializable intermediate representation. A module converted to TorchScript can be saved to disk and executed without the Python interpreter, e.g. from C++ via libtorch.

import torch

# Tracing: records the ops executed for one example input;
# suitable only when control flow does not depend on the data
traced = torch.jit.trace(model, example_input)
traced.save('model_traced.pt')

# Scripting: compiles the module's Python source directly;
# handles data-dependent control flow (if/loops on tensor values)
scripted = torch.jit.script(model)
scripted.save('model_scripted.pt')
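A saved TorchScript module can be reloaded without the original model class being importable. A minimal round-trip sketch (the `Sequential` model and file name here are illustrative, not from the original):

```python
import torch
import torch.nn as nn

# Stand-in model; any nn.Module works
model = nn.Sequential(nn.Linear(4, 2)).eval()
traced = torch.jit.trace(model, torch.randn(1, 4))
traced.save('model.pt')

# Loading requires only torch, not the Python class that defined the model
loaded = torch.jit.load('model.pt')
out = loaded(torch.randn(3, 4))
```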

ONNX Export

Definition: ONNX (Open Neural Network Exchange) is a framework-neutral graph format for models; an exported .onnx file can be executed by runtimes such as ONNX Runtime or TensorRT.

torch.onnx.export(model, example_input, 'model.onnx',
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch'}})

Example: Post-Training Quantisation

Quantise a model to INT8 for faster inference.
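One common post-training approach is dynamic quantisation, where weights are stored as INT8 and activations are quantised on the fly at inference time. A minimal sketch using PyTorch's `quantize_dynamic` on an illustrative model:

```python
import torch
import torch.nn as nn

# Stand-in model; dynamic quantisation targets the Linear layers
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Weights converted to INT8 once; activations quantised per-batch at runtime
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

out = quantized(torch.randn(2, 128))
```

Static quantisation (with calibration data) typically gives larger speedups for conv-heavy models, but dynamic quantisation needs no calibration step.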

Inference Speed Comparison

Compare latency across export formats.
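A simple way to compare formats is to time warmed-up forward passes. The sketch below benchmarks eager vs. traced TorchScript on a toy model (model shape and iteration counts are illustrative); the same `latency_ms` helper can wrap an ONNX Runtime session:

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
x = torch.randn(1, 128)
traced = torch.jit.trace(model, x)

def latency_ms(fn, iters=100):
    # Warm up first so one-time compilation/caching is excluded,
    # then average wall-clock time per call
    for _ in range(10):
        fn(x)
    start = time.perf_counter()
    for _ in range(iters):
        fn(x)
    return (time.perf_counter() - start) / iters * 1000

with torch.no_grad():
    eager_ms = latency_ms(model)
    traced_ms = latency_ms(traced)
print(f'eager: {eager_ms:.3f} ms, traced: {traced_ms:.3f} ms')
```

For trustworthy numbers, benchmark on the target hardware with production batch sizes, and pin thread counts (`torch.set_num_threads`) so runs are comparable.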

Parameters