From chatbot to agent
A standard chatbot is reactive: you send a message, it replies, and it waits. An AI agent is given a goal rather than a single question, and it pursues that goal autonomously. The key shift is the loop. Instead of stopping after one response, the agent decides on an action, carries it out, observes what happened, and uses that observation to decide its next action — repeating until it believes the goal is achieved or it hits a limit. The language model is no longer just a text generator; it is the decision-maker steering a process.
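The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a real agent: `decide` is a hypothetical stand-in for the language model, and the toy counting task, `act`, `run_agent`, and `max_steps` are names invented for the example.

```python
from typing import Optional


def decide(goal: int, observations: list[int]) -> Optional[str]:
    # Hypothetical stand-in for the model: look at the latest observation
    # and return the next action, or None once the goal looks achieved.
    current = observations[-1] if observations else 0
    if current >= goal:
        return None
    return "increment"


def act(action: str, state: int) -> int:
    # Carry out the chosen action and return what happened (the observation).
    if action == "increment":
        return state + 1
    raise ValueError(f"unknown action: {action}")


def run_agent(goal: int, max_steps: int = 10) -> tuple[int, str]:
    observations: list[int] = []
    state = 0
    steps = 0
    while True:
        action = decide(goal, observations)   # decide
        if action is None:
            return state, "goal reached"
        if steps >= max_steps:
            return state, "step limit hit"
        state = act(action, state)            # act
        observations.append(state)            # observe
        steps += 1
```

Calling `run_agent(3)` stops because the stand-in model judges the goal met, while `run_agent(20, max_steps=5)` stops because the cap is reached first. Both exits matter: the second is what keeps a confused agent from running forever.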
The four building blocks
Most agents are built from four parts. A planner turns a high-level goal into a sequence of concrete steps. Tools give the agent ways to affect the world — web search, running code, calling an API, reading a file. Memory lets it carry information between steps and across sessions, from a short scratchpad of recent actions to a long-term vector store of past work. And a control loop ties these together, feeding each tool result back to the model so it can choose the next move. The LLM sits at the centre as the reasoning engine that picks tools and judges progress.
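One way to picture how the four parts fit together is as fields on a single object. The sketch below is an assumption-laden toy: the `Agent` class, `toy_planner`, and the two string tools are hypothetical, and for brevity the planner fixes all steps up front instead of consulting a model after each result, as a real control loop would.

```python
from dataclasses import dataclass, field
from typing import Callable

# A plan is a list of (tool name, argument) pairs.
Plan = list[tuple[str, str]]


@dataclass
class Agent:
    planner: Callable[[str], Plan]                    # goal -> concrete steps
    tools: dict[str, Callable[[str], str]]            # ways to act
    memory: list[str] = field(default_factory=list)   # short-term scratchpad

    def run(self, goal: str) -> list[str]:
        # Control loop: execute each planned step and record the result.
        # A fuller agent would hand self.memory back to the model here
        # so it could revise the remaining plan.
        for tool_name, argument in self.planner(goal):
            result = self.tools[tool_name](argument)
            self.memory.append(f"{tool_name}({argument!r}) -> {result}")
        return self.memory


def toy_planner(goal: str) -> Plan:
    # Hypothetical planner: always proposes the same two steps.
    return [("upper", goal), ("length", goal)]


agent = Agent(
    planner=toy_planner,
    tools={"upper": str.upper, "length": lambda text: str(len(text))},
)
```

Keeping the planner, tools, and memory as separate, swappable pieces means each can be replaced independently, for example by trading the list-based scratchpad for a vector store without touching the loop.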
How tool-calling drives action
An agent acts on the world through tool-calling. The model is told which tools exist and what arguments each accepts, usually as structured schemas. When it decides to use one, it emits a structured request — for example, a search query or a snippet of code — which the surrounding program runs. The result is returned to the model as an observation. This loop of decide, act, observe is what lets an agent gather information it did not start with and adapt its plan as it learns.
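The exchange described above, where the model emits a structured request and the program runs it, might look roughly like this. The schema layout, the `TOOLS` registry, and the `add` tool are hypothetical and deliberately simplified; they do not reproduce any particular provider's tool-calling format.

```python
import json
from typing import Any


def add(a: float, b: float) -> float:
    return a + b


# Hypothetical registry: each entry pairs the schema the model is shown
# with the Python function the surrounding program will actually run.
TOOLS: dict[str, dict[str, Any]] = {
    "add": {
        "description": "Add two numbers.",
        "parameters": {"a": "number", "b": "number"},
        "function": add,
    },
}


def tool_descriptions() -> str:
    """Serialise the schemas (without the functions) for the model's prompt."""
    public = {
        name: {"description": t["description"], "parameters": t["parameters"]}
        for name, t in TOOLS.items()
    }
    return json.dumps(public, indent=2)


def execute_tool_call(raw_request: str) -> str:
    """Parse a structured request, run the tool, and return an observation."""
    try:
        request = json.loads(raw_request)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid JSON: {exc.msg}"})
    if not isinstance(request, dict):
        return json.dumps({"error": "request must be a JSON object"})

    name = request.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})

    tool = TOOLS[name]
    arguments = request.get("arguments", {})
    # Only argument names are checked here, not their types.
    if not isinstance(arguments, dict) or set(arguments) != set(tool["parameters"]):
        return json.dumps({"error": "arguments do not match the schema"})

    return json.dumps({"result": tool["function"](**arguments)})
```

Errors are returned as observations instead of being raised, so that a malformed request becomes something the model can read and correct on its next turn.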
Real systems and patterns
Several well-known systems put these ideas into practice. AutoGPT popularised the idea of an agent that recursively plans subtasks toward a top-level goal. LangChain agents provide a framework for wiring LLMs to tools and running the reasoning loop. Anthropic’s Computer Use lets a model operate a computer by viewing the screen and issuing clicks and keystrokes as tool calls. Beyond single agents, multi-agent systems assign roles (a planner, a researcher, a critic) and let specialised agents hand work to each other, accepting extra coordination cost with the aim of higher quality on complex jobs.
Strengths, limits, and good practice
Agents shine on tasks that need more steps than a single prompt can handle but are still bounded: researching a question, refactoring code across files, or pulling data from several sources. Their main weakness is compounding error: over a long run, a small early mistake can derail everything, and agents can get stuck in loops or hallucinate actions. The practical answers are to scope the goal tightly, give the agent high-quality tools, cap the number of steps, and insert human checkpoints for anything costly or irreversible. Treat the agent as a capable but fallible junior colleague that does best with clear boundaries.
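Two of these safeguards, the step cap and the human checkpoint, are simple to express in code. The following is a rough sketch under assumptions: `RISKY_ACTIONS`, `run_with_guardrails`, and the action names are hypothetical, and actions are only logged here, not executed.

```python
from typing import Callable, Iterable

# Hypothetical set of action names treated as costly or irreversible.
RISKY_ACTIONS = {"delete_file", "send_payment"}


def run_with_guardrails(
    proposed_actions: Iterable[str],
    approve: Callable[[str], bool],
    max_steps: int = 5,
) -> list[str]:
    """Process proposed actions, enforcing a step cap and human approval."""
    log: list[str] = []
    for step, action in enumerate(proposed_actions):
        if step >= max_steps:
            # Hard stop, however much work the agent thinks remains.
            log.append("stopped: step cap reached")
            break
        if action in RISKY_ACTIONS and not approve(action):
            # Human checkpoint: a refused action is skipped and recorded.
            log.append(f"blocked: {action}")
            continue
        log.append(f"ran: {action}")
    return log
```

In an interactive setting, `approve` could be something like `lambda action: input(f"Allow {action}? [y/N] ") == "y"`; passing it in as a function keeps the checkpoint policy separate from the loop and easy to test.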