What Is Tool Use in AI? How LLMs Call APIs and Run Code

Function calling, code interpreters, and web search: AI beyond text

Ad placeholder (leaderboard)

Why models need tools

A language model on its own can only generate text from what it learned during training. It has no live web access, cannot run a calculation reliably, and cannot take real actions like sending an email or querying your database. Tool use closes that gap. By giving the model a set of functions it can call, you let it fetch current information, perform exact computation, and act on external systems — turning a text predictor into something that can do real work grounded in the present.

How tool use actually works

Crucially, the model does not execute anything itself. The flow is a hand-off. You describe the available tools to the model. When it decides one is needed, it emits a structured request naming the tool and its arguments. Your surrounding code runs that tool and returns the result back to the model as a new message. The model reads the result and continues — answering the user, or calling another tool. This request-execute-return cycle is the foundation of every agent and code-interpreter feature you have seen.

Function calling and JSON mode

The reliable way to do this is function calling. You define each function with a schema — name, description, and typed parameters — and the API guarantees the model’s call conforms to that schema. This is stronger than asking for JSON mode, where the model merely produces JSON-shaped text you must parse and validate yourself. With function calling, the structure is enforced, so you can wire the model’s output straight into code without defensive parsing. Function calling is the recommended path whenever the model needs to trigger an action.

OpenAI and Anthropic formats

The idea is universal but the wire formats differ. OpenAI takes a tools array of function schemas and, when the model wants a tool, returns tool_calls you execute and answer. Anthropic uses tool definitions and returns tool_use content blocks, which you respond to with matching tool_result blocks. Both expect JSON schemas for arguments and both let the model chain several tool calls in a conversation. Agent frameworks like LangChain typically wrap these so you define a tool once and it works across providers.

Common tools and good design

The most common tools are web search (for current facts), code interpreters (for exact math, data analysis, and file handling), and API or database calls (for actions and private data). Two design choices make tool use work well: write clear, specific tool descriptions so the model picks correctly, and return concise, well-structured results so it can reason about them. Tool use is what takes an LLM beyond a clever autocomplete and makes it a component that can search, compute, and act in the real world.

Ad placeholder (rectangle)