Question 1

What is a neural network in simple terms?

Accepted Answer

A neural network is a system of simple connected units, loosely inspired by brain cells, that learns patterns from examples. Each unit takes numbers in, multiplies them by adjustable weights, adds them up, and passes the result through a simple function. By tuning millions of weights against labelled examples, the network learns to map inputs — like an image — to outputs — like 'cat'. It is statistics and arithmetic at scale, not a literal brain.
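As a rough illustration of a single unit, here is a minimal sketch in Python. The bias term and the choice of a sigmoid as the "simple function" are assumptions made for the example; the answer above does not specify either, and the input and weight values are made up.

```python
import math

def neuron(inputs, weights, bias):
    """One unit: weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    # Multiply each input by its weight and add everything up.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid maps any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total))

# Illustrative values only (assumed for this sketch).
output = neuron([0.5, -1.0, 2.0], [0.4, 0.3, -0.2], bias=0.1)
print(round(output, 4))
```

A full network is many of these units wired together, with the weights and biases adjusted during training rather than set by hand.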

Question 2

How does a neural network learn?

Accepted Answer

It learns by trial and error guided by maths. The network makes a prediction, a loss function measures how wrong it was, and an algorithm called backpropagation calculates how much each weight contributed to the error. Gradient descent then nudges every weight a small step, scaled by a learning rate, in the direction that reduces the error. Repeat this over millions of examples and the weights settle into values that produce good predictions.
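The loop described above can be sketched on the smallest possible model: one weight, no layers. This is a toy under stated assumptions, not how a real framework is written. The squared-error loss, the learning rate of 0.05 and the three data points are all chosen for illustration, and with a single weight the "backpropagation" step reduces to one application of the chain rule.

```python
def train(examples, learning_rate=0.05, epochs=50):
    """Fit prediction = w * x by gradient descent on squared error."""
    w = 0.0  # start from an arbitrary weight
    for _ in range(epochs):
        for x, target in examples:
            prediction = w * x               # 1. make a prediction
            error = prediction - target      # 2. measure how wrong it was
            gradient = 2 * error * x         # 3. d(error**2)/dw via the chain rule
            w -= learning_rate * gradient    # 4. nudge the weight downhill
    return w

# Toy data generated from target = 3 * x (assumed for this sketch).
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned_w = train(examples)
print(learned_w)
```

The learned weight ends up very close to 3, the value that produced the data. In a real network the same four steps run over millions of weights at once, with backpropagation supplying the gradient for each one.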

Question 3

What is the difference between a perceptron and a deep neural network?

Accepted Answer

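A perceptron is a single artificial neuron (or a single layer of them) that separates its inputs with one straight-line boundary, so it can only learn linearly separable patterns. It famously cannot learn the XOR function, because no single straight line separates XOR's two classes. A deep neural network stacks many layers with non-linear functions between them, letting it learn hierarchical features: early layers detect simple patterns and later layers combine them into complex concepts. Depth, combined with those non-linearities, is a large part of what gives modern networks their power.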
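The XOR limitation is easy to check by experiment. The sketch below trains a single perceptron with the classic perceptron learning rule on AND (linearly separable) and on XOR (not). The function names, the epoch count and the step threshold at zero are assumptions made for this example.

```python
def train_perceptron(data, epochs=25, lr=1.0):
    """Classic perceptron rule: adjust weights only when a prediction is wrong."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            prediction = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            update = lr * (target - prediction)  # 0 when the prediction is right
            w[0] += update * x1
            w[1] += update * x2
            b += update
    return w, b

def accuracy(data, w, b):
    """Fraction of examples the trained perceptron classifies correctly."""
    correct = 0
    for (x1, x2), target in data:
        prediction = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        correct += prediction == target
    return correct / len(data)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

The perceptron gets every AND case right, but it can never get all four XOR cases right however long it trains, because no single line splits them. Adding a hidden layer with a non-linear function removes that limit.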

Question 4

How are transformers different from older neural networks?

Accepted Answer

Older sequence models, such as recurrent neural networks (RNNs and LSTMs), processed sequences one step at a time, which made them slow to train and prone to losing information from early in a long input. Transformers use an attention mechanism that lets every element in a sequence directly weigh the relevance of every other element, all in parallel. This makes them far better at handling language and long-range context, and it is why they power modern large language models like GPT and Claude.
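To make "every element weighs every other element" concrete, here is a stripped-down sketch of scaled dot-product self-attention. It is a simplification under stated assumptions: real transformers first pass the inputs through learned query, key and value projections and use several attention heads, whereas this example uses the raw vectors for all three roles. The three input vectors are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Each vector becomes a relevance-weighted mix of all vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this element against every element, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Blend all vectors according to those weights.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy "token" vectors (assumed for this sketch).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attention_weights = self_attention(tokens)
for row in attention_weights:
    print([round(w, 3) for w in row])
```

Each row of weights shows how much one element attends to every other. The loop here runs one query at a time for readability, but no query depends on another's result, which is why real implementations compute them all at once as matrix operations.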

Neural Networks Explained: From Perceptron to Transformer

The basic building block: the neuron

The perceptron and its limits

Going deep: hidden layers and learning

Specialised architectures

The transformer revolution