Part 6: Deep Learning with PyTorch

Chapter 30: Attention and Transformer Architectures

Intermediate · ~150 min

Learning Objectives

  • Implement scaled dot-product attention and multi-head attention from scratch (see the sketch after this list)
  • Build a complete transformer encoder-decoder architecture
  • Implement a Vision Transformer (ViT) for image classification and related image-processing tasks
  • Adapt transformers for scientific data: sequences, grids, and point clouds
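
As a preview of the first objective, here is a minimal sketch of scaled dot-product attention in PyTorch. It computes softmax(QK^T / √d_k)·V, the operation the rest of the chapter builds on. The function name, signature, and shapes are illustrative assumptions, not the chapter's own code:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (..., seq_len, d_k).
    mask: optional boolean tensor broadcastable to
          (..., seq_len, seq_len); True marks positions to hide.
    """
    d_k = q.size(-1)
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    # to keep softmax gradients stable as the head dimension grows.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention weights, rows sum to 1
    return weights @ v

# Example: self-attention over a batch of 2 sequences of length 5
x = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(x, x, x)  # shape (2, 5, 64)
```

Multi-head attention, covered in the chapter itself, runs several of these operations in parallel on learned projections of the inputs and concatenates the results.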

