Question 1

What is a recurrent neural network in simple terms?

Accepted Answer

An RNN is a neural network that reads a sequence one step at a time and keeps a running summary called a hidden state. At each step it combines the current input with the hidden state from the previous step, so information from earlier in the sequence can influence later outputs. This made RNNs a natural fit for text, speech, and time-series data.
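To make the "running summary" concrete, here is a minimal sketch of a scalar vanilla RNN, assuming the common tanh update. The names `rnn_step` and `run_rnn` and the weight values are illustrative choices, not taken from any particular library.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a scalar vanilla RNN: h_t = tanh(w_x * x_t + w_h * h_prev + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Read the sequence one element at a time, carrying the hidden state forward."""
    h = 0.0  # initial hidden state (assumed to be zero)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Only the first input is non-zero, yet the later hidden states are still
# non-zero: the first input keeps influencing them through the hidden state.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

A real RNN uses vectors and weight matrices instead of single numbers, but the loop has the same shape.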

Question 2

What is the vanishing gradient problem?

Accepted Answer

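When an RNN is trained on long sequences, the error signal has to flow backwards through the sequence, and it is multiplied by a similar factor at every time step. If those factors are consistently smaller than one in magnitude, the signal shrinks exponentially with sequence length until it effectively vanishes, so the network cannot learn long-range dependencies. The opposite case, exploding gradients, happens when the factors are consistently larger than one and the signal grows exponentially instead.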
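The effect of repeated multiplication is easy to see numerically. The sketch below is a simplification: it assumes the same constant factor at every step, whereas in a real RNN the factor depends on the weights and activations at each step. The function name `gradient_scale` is hypothetical.

```python
def gradient_scale(factor, steps):
    """Size of a gradient after being multiplied by `factor` once per time step."""
    scale = 1.0
    for _ in range(steps):
        scale *= factor
    return scale


# Compare a shrinking, a neutral, and a growing factor over 100 time steps.
for factor in (0.9, 1.0, 1.1):
    print(factor, gradient_scale(factor, 100))
```

Even a factor as close to one as 0.9 leaves almost nothing of the signal after 100 steps, while 1.1 blows it up by several orders of magnitude.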

Question 3

How do LSTMs and GRUs improve on vanilla RNNs?

Accepted Answer

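LSTMs and GRUs add gating mechanisms that decide what information to keep, what to forget, and what to expose at each step. An LSTM keeps a separate memory cell controlled by input, forget, and output gates; a GRU has no separate cell and instead uses update and reset gates directly on the hidden state. In both designs the learned gates let gradients flow more stably across many time steps, which made it practical to model much longer sequences than a plain RNN could.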
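As a rough illustration of gating, here is a scalar sketch of one LSTM step, assuming the standard formulation with forget, input, and output gates plus a candidate value. The parameter layout (`p` mapping a gate name to a `(w_x, w_h, b)` tuple) and the specific numbers are hypothetical, chosen only to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p maps a gate name to its (w_x, w_h, b) weights."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell value to keep
    i = sigmoid(pre("input"))        # how much of the new candidate to write
    o = sigmoid(pre("output"))       # how much of the cell to expose as output
    g = math.tanh(pre("candidate"))  # candidate value to write into the cell
    c = f * c_prev + i * g           # additive cell update
    h = o * math.tanh(c)
    return h, c


# Gates biased towards "keep everything, write nothing": the cell value
# survives the step almost unchanged.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = lstm_step(1.0, 0.0, 2.0, params)
print(h, c)
```

The key line is the cell update: when the forget gate is close to one, the old cell value passes through nearly untouched, so the signal is not squeezed at every step the way it is in a plain RNN.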

Question 4

Why did transformers replace RNNs?

Accepted Answer

RNNs must process a sequence step by step, which makes training slow because the steps cannot run in parallel, and even gated variants still struggle with very long-range dependencies. Transformers use attention to look at every position in parallel, so they train far faster on modern hardware and can relate distant positions directly. That speed and quality advantage made transformers the default for most language tasks.
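A small sketch of attention shows why positions can be handled independently. It assumes the widely used scaled dot-product form; the helper names `softmax` and `attention` are illustrative, and a real implementation would use matrix operations on an accelerator rather than Python lists.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to one."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    # Each query is handled independently of the others, so this loop
    # could run for all positions at once.
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# A zero query scores every key equally, so the output is the mean of the values.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0], [4.0]])
print(result)
```

Unlike the RNN loop, nothing computed for one position is needed before starting the next, and every position can draw on any other position in a single step.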

What Is an RNN? Recurrent Neural Networks and Sequence Modelling

The core idea

How the hidden state carries memory

The vanishing gradient problem

LSTMs and GRUs to the rescue

Why transformers took over