Part 3: GPU Computing: CuPy and PyTorch Tensors

Chapter 10: GPU Computing Fundamentals

Intermediate · ~120 min

Learning Objectives

  • Explain GPU architecture (SMs, warps, memory hierarchy) and its implications for scientific code
  • Describe the CUDA programming model at the conceptual level
  • Set up CUDA, cuDNN, and Python GPU libraries on local and remote machines
  • Profile GPU code using nvprof and Nsight (Nsight Systems and Nsight Compute) to identify memory and compute bottlenecks
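As a starting point for the setup objective, here is a minimal sketch of an environment check. It assumes the "Python GPU libraries" are CuPy and PyTorch, as the part title suggests. The helper name `gpu_environment_report` is hypothetical and not part of the course material. The sketch only reports what it finds, so it also runs on a machine with no GPU or with neither library installed.

```python
import importlib.util


def gpu_environment_report():
    """Report which GPU libraries are importable and whether each sees a CUDA device.

    Hypothetical helper for illustration; assumes CuPy and PyTorch are the
    libraries of interest.
    """
    report = {}
    for name in ("cupy", "torch"):
        # find_spec returns None when a top-level package is not installed.
        installed = importlib.util.find_spec(name) is not None
        report[name] = {"installed": installed, "cuda_available": False}

    if report["torch"]["installed"]:
        try:
            import torch
            report["torch"]["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install or driver mismatch is reported as "not available".
            pass

    if report["cupy"]["installed"]:
        try:
            import cupy
            report["cupy"]["cuda_available"] = bool(cupy.cuda.is_available())
        except Exception:
            pass

    return report


if __name__ == "__main__":
    for lib, status in gpu_environment_report().items():
        print(f"{lib}: installed={status['installed']}, "
              f"cuda_available={status['cuda_available']}")
```

If a library shows as installed but `cuda_available` is `False`, the usual suspects are a missing or mismatched NVIDIA driver, or a CPU-only build of the library.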

