Question 1

Why is AI trained on GPUs instead of CPUs?

Accepted Answer

Training and running a neural network consists mostly of large matrix multiplications, and each of those breaks down into a huge number of identical, independent multiply-add operations. A CPU has a relatively small number of powerful cores tuned for sequential, branching logic, while a GPU has thousands of simpler cores that run those operations in parallel. For this specific workload a GPU can be tens to hundreds of times faster than a CPU, depending on the model and the hardware being compared.
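
To make the "identical, independent multiply-add operations" concrete, here is a minimal sketch of a matrix multiplication in plain Python. The function names `matmul` and `multiply_add_count` are made up for this illustration and do not come from any framework. The point is only that every output element is computed without reference to any other, which is what lets a GPU spread the work across many cores.

```python
# Illustrative only: plain Python is far slower than a real GPU or BLAS kernel.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):          # every (i, j) pair is independent of the others,
        for j in range(m):      # so parallel hardware can compute them side by side
            acc = 0.0
            for p in range(k):  # k multiply-adds per output element
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out


def multiply_add_count(n, k, m):
    """Number of multiply-add operations in an (n x k) times (k x m) product."""
    return n * k * m


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul(a, b))                          # [[19.0, 22.0], [43.0, 50.0]]
print(multiply_add_count(1024, 1024, 1024))  # 1073741824
```

Even a single 1024 x 1024 product involves about a billion multiply-adds, and none of the output elements has to wait for another.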

Question 2

What is CUDA and why does it matter?

Accepted Answer

CUDA is NVIDIA's programming model and software stack that lets developers run general-purpose parallel code on NVIDIA GPUs. It matters because the major AI frameworks (PyTorch, TensorFlow, JAX) rely on CUDA libraries such as cuDNN for their NVIDIA GPU support, and most AI code is written and tuned against that stack first. NVIDIA's dominance in AI hardware owes as much to this software lock-in as to the chips themselves.
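
The programming model can be sketched without a GPU. The snippet below is a plain-Python emulation, not real CUDA code: `vector_add_kernel` and `launch` are hypothetical names, and the nested loops stand in for threads that actual hardware would run concurrently. What it does mirror is the way a CUDA kernel works out which element it owns, using the block index, the block size, and the thread index.

```python
# Plain-Python emulation of how a 1-D CUDA kernel launch assigns work to threads.
# Nothing here touches a GPU; the loops run one "thread" at a time.

def vector_add_kernel(block_idx, block_dim, thread_idx, x, y, out):
    """Body of a hypothetical kernel: each 'thread' handles one element."""
    i = block_idx * block_dim + thread_idx  # global index of this thread
    if i < len(out):                        # guard: the last block may overshoot
        out[i] = x[i] + y[i]


def launch(kernel, n, block_dim, *args):
    """Emulate a launch with enough blocks of block_dim threads to cover n elements."""
    grid_dim = (n + block_dim - 1) // block_dim  # ceiling division
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)
    return grid_dim


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(x)
blocks = launch(vector_add_kernel, len(x), 4, x, y, out)
print(blocks, out)  # 2 [11.0, 22.0, 33.0, 44.0, 55.0]
```

With five elements and four threads per block, two blocks are launched and the bounds check keeps the three spare threads in the second block from writing past the end.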

Question 3

Why do AI GPUs like the H100 cost so much?

Accepted Answer

Data-centre GPUs such as the H100 are expensive for several reasons: they use costly high-bandwidth memory (HBM), specialised tensor cores, and fast interconnects like NVLink; supply is limited; and demand from AI labs is very high. The chips are also sold with the full CUDA software ecosystem and support, which adds value beyond the raw silicon.

Question 4

Do you need an NVIDIA GPU to use AI?

Accepted Answer

No. To use AI through an API from a provider such as OpenAI or Anthropic you need no GPU at all, because the provider runs the hardware. You only need a powerful GPU if you want to train models or run large models locally. Alternatives also exist, including Google's TPUs, AMD's accelerators, and Apple's chips with unified memory for smaller local models.
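
As a rough guide to what "large" means for local use, the weights alone need about one value per parameter. The sketch below is a back-of-the-envelope estimate under stated assumptions: `weight_memory_gb` is a hypothetical helper, the 7-billion-parameter model is an example size rather than a specific product, and the figure ignores activations and other runtime overhead.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Memory needed for the weights alone, in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


params = 7_000_000_000              # hypothetical 7-billion-parameter model
print(weight_memory_gb(params, 4))  # 28.0  (32-bit weights)
print(weight_memory_gb(params, 2))  # 14.0  (16-bit weights)
```

Comparing that figure with the memory available on a given GPU or unified-memory machine gives a first indication of whether a model could fit locally.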

Why Does AI Run on GPUs? NVIDIA's Role in the AI Revolution

Why neural networks love parallel hardware

How CUDA made GPUs programmable for AI

What makes a data-centre AI GPU special

The wider accelerator landscape