What is the difference between AssistantAgent and UserProxyAgent?

AssistantAgent is the LLM-backed agent that reasons and writes responses or code. UserProxyAgent stands in for the human — it relays the task, optionally asks for human input, and is the agent that actually executes any code the assistant produces. Most AutoGen apps pair at least one of each.

How does code execution work?

When you set a code_execution_config on the UserProxyAgent, it extracts code blocks from the assistant's messages, runs them in the given work_dir, and feeds the output back into the conversation. Set use_docker to True for isolation in production; the builder uses False for easy local testing.
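As a rough illustration, the config is just a dictionary handed to the proxy. The sketch below assumes the classic 0.2-style autogen package (pip install pyautogen); the helper names make_code_execution_config and build_executor_proxy are hypothetical, and the autogen import is deferred so the config logic can be tried without the library installed.

```python
def make_code_execution_config(work_dir="coding", production=False):
    """Build a code_execution_config dict (hypothetical helper).

    Docker is switched on only when production=True, mirroring the
    advice above: isolation for real runs, plain host execution locally.
    """
    return {
        "work_dir": work_dir,            # directory where extracted code is written and run
        "use_docker": bool(production),  # True -> run inside a container
    }


def build_executor_proxy(production=False):
    """Create a UserProxyAgent that executes the assistant's code blocks."""
    # Deferred import: assumes the 0.2-style `autogen` package is installed.
    from autogen import UserProxyAgent

    return UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        code_execution_config=make_code_execution_config(production=production),
    )


local_config = make_code_execution_config()
prod_config = make_code_execution_config(production=True)
print(local_config)
print(prod_config)
```

With use_docker enabled, Docker needs to be installed and running on the machine that hosts the proxy.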

When should I use a group chat instead of two agents?

Two agents are ideal for a single assistant solving a contained task. A group chat shines when a task benefits from specialisation — a planner that decomposes, a coder that implements, and a reviewer that checks. The GroupChatManager decides who speaks next based on the conversation.

What does human_input_mode control?

It governs when the UserProxyAgent pauses for you. NEVER runs fully autonomously until a termination message or the reply limit, ALWAYS asks you before each turn, and TERMINATE asks only when a termination message arrives or max_consecutive_auto_reply is reached. Use ALWAYS while debugging and NEVER for hands-off runs.
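One way to keep the choice explicit is to derive the mode from how the script is being run. This is only a sketch: pick_human_input_mode and build_proxy are hypothetical helpers, and the UserProxyAgent call assumes the 0.2-style autogen package.

```python
VALID_MODES = ("ALWAYS", "TERMINATE", "NEVER")


def pick_human_input_mode(debugging, unattended):
    """Map a run style to a human_input_mode value (hypothetical helper)."""
    if debugging:
        return "ALWAYS"     # confirm every turn while you are watching
    if unattended:
        return "NEVER"      # hands-off run, bounded by the reply limit
    return "TERMINATE"      # only check in when the run is wrapping up


def build_proxy(mode, max_replies=5):
    """Create a proxy with the chosen mode; code execution is disabled here."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown human_input_mode: {mode!r}")
    # Deferred import: assumes the 0.2-style `autogen` package is installed.
    from autogen import UserProxyAgent

    return UserProxyAgent(
        name="user_proxy",
        human_input_mode=mode,
        max_consecutive_auto_reply=max_replies,
        code_execution_config=False,
    )


print(pick_human_input_mode(debugging=True, unattended=False))
print(pick_human_input_mode(debugging=False, unattended=True))
```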

How do I stop runaway loops and cost?

Cap max_consecutive_auto_reply on the proxy and max_round on a group chat, and keep temperature low. Agents can otherwise loop politely forever, burning tokens. Setting clear termination conditions, such as a reviewer replying APPROVED, is the cleanest way to end a conversation.
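A termination condition can be expressed as a small predicate passed to the agent through is_termination_msg. The sketch below assumes the 0.2-style autogen package; is_approved, LOOP_CAPS, and build_capped_proxy are illustrative names and the cap values are arbitrary. The predicate requires APPROVED on its own final line so that a reply ending in "NOT APPROVED" does not end the chat by accident.

```python
# Illustrative caps; tune them to your task and budget.
LOOP_CAPS = {"max_consecutive_auto_reply": 5, "max_round": 12}


def is_approved(message):
    """Return True when the last non-empty line of the message is APPROVED."""
    content = message.get("content")
    if not isinstance(content, str):
        return False  # e.g. None for tool calls
    lines = [line.strip() for line in content.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "APPROVED"


def build_capped_proxy():
    """Create a proxy that stops on APPROVED or after the reply cap."""
    # Deferred import: assumes the 0.2-style `autogen` package is installed.
    from autogen import UserProxyAgent

    return UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        max_consecutive_auto_reply=LOOP_CAPS["max_consecutive_auto_reply"],
        is_termination_msg=is_approved,
        code_execution_config=False,
    )


print(is_approved({"content": "Tests pass.\nAPPROVED"}))
print(is_approved({"content": "NOT APPROVED"}))
```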

How to Use AutoGen for Multi-Agent AI

What AutoGen is

AutoGen is Microsoft’s framework for building applications where multiple AI agents hold a structured conversation to solve a task. Instead of one model call, you define agents with distinct roles and let them message each other: one writes code, another runs it, a third reviews. AutoGen’s core insight is that complex work is easier to orchestrate as a conversation with clear turn-taking than as a single mega-prompt. The builder above generates a runnable script for the two most common patterns.

How it works

Every AutoGen agent is a message handler. The two foundational types are:

AssistantAgent — backed by an LLM. It receives the conversation so far and produces the next message, which may contain prose or fenced code.
UserProxyAgent — represents the human. It starts the conversation with initiate_chat, optionally pauses for human input, and crucially is the agent that executes code blocks (via code_execution_config) and feeds results back.

For a single assistant, that two-agent loop is enough: the proxy sends the task, the assistant proposes code, the proxy runs it, errors come back, the assistant fixes them, and the loop continues until success or the reply cap.
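The two-agent loop described above might look roughly like this. It is a sketch under assumptions, not the builder's exact output: it targets the 0.2-style autogen package, the model name gpt-4o-mini and the OPENAI_API_KEY environment variable are placeholders, and build_llm_config, ends_with_terminate, and run_two_agent_chat are hypothetical helpers. It relies on the assistant ending its final message with TERMINATE, which AssistantAgent's default system prompt asks for.

```python
import os


def build_llm_config(model="gpt-4o-mini"):
    """LLM settings for the assistant; the model name is a placeholder."""
    return {
        "config_list": [
            {"model": model, "api_key": os.environ.get("OPENAI_API_KEY", "")}
        ],
        "temperature": 0,
    }


def ends_with_terminate(message):
    """True when the assistant signals completion with a trailing TERMINATE."""
    content = message.get("content")
    return isinstance(content, str) and content.rstrip().endswith("TERMINATE")


def run_two_agent_chat(task):
    """Proxy sends the task, the assistant proposes code, the proxy runs it."""
    # Deferred import: assumes the 0.2-style `autogen` package is installed.
    from autogen import AssistantAgent, UserProxyAgent

    assistant = AssistantAgent(name="assistant", llm_config=build_llm_config())
    user_proxy = UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        max_consecutive_auto_reply=5,          # the reply cap
        is_termination_msg=ends_with_terminate,
        code_execution_config={"work_dir": "coding", "use_docker": False},
    )
    # Execution output and errors go back to the assistant on each turn.
    return user_proxy.initiate_chat(assistant, message=task)


config = build_llm_config()
print(config["temperature"])
```

To try it, call run_two_agent_chat("Write and run Python that prints the first 10 primes.") with a valid API key set.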

For harder problems you add a GroupChat of specialist assistants (a planner, a coder, a reviewer) managed by a GroupChatManager that picks who speaks next. This separation of concerns often produces better results on multi-step coding tasks than one do-everything agent.
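A group chat along these lines could be wired up as follows. Again this is a sketch assuming the 0.2-style autogen package: the role prompts in ROLES are invented for illustration, run_group_chat is a hypothetical helper, and llm_config is expected to be a dictionary with a config_list like the one used for a single assistant.

```python
# Illustrative system prompts, one per specialist.
ROLES = {
    "planner": "Break the task into numbered steps. Do not write code.",
    "coder": "Implement the current step as one complete Python code block.",
    "reviewer": "Check the coder's work. Reply APPROVED on its own line when it is correct.",
}


def run_group_chat(task, llm_config, max_round=12):
    """Run a planner/coder/reviewer chat coordinated by a GroupChatManager."""
    # Deferred import: assumes the 0.2-style `autogen` package is installed.
    from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent

    specialists = [
        AssistantAgent(name=name, system_message=prompt, llm_config=llm_config)
        for name, prompt in ROLES.items()
    ]
    user_proxy = UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        max_consecutive_auto_reply=5,
        code_execution_config={"work_dir": "coding", "use_docker": False},
    )
    group_chat = GroupChat(
        agents=[user_proxy, *specialists],
        messages=[],
        max_round=max_round,  # hard stop for the whole conversation
    )
    # The manager uses its own LLM call to pick the next speaker.
    manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)
    return user_proxy.initiate_chat(manager, message=task)


print(sorted(ROLES))
```

The proxy stays in the group so that someone can execute the coder's output; max_round bounds the run even if the reviewer never approves.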

Tips and gotchas

Always cap the loop. Set max_consecutive_auto_reply and max_round, and give the reviewer a clear termination signal (reply APPROVED) so the conversation actually ends.
Use Docker in production. use_docker=False is fine for local experiments, but executing model-written code on your host is risky — flip it on for anything real.
Keep temperature at 0 for code. More repeatable output makes debugging multi-agent runs far less maddening, even though a temperature of 0 does not guarantee identical responses.
Start with two agents. Reach for a group chat only when a single assistant visibly struggles — more agents means more tokens and more ways for the conversation to go sideways.