From Chatbot to Agent: When to Adopt MCP
Every organisation that adopts AI internally starts in the same place: a chatbot wrapped around an LLM, sitting behind a chat UI or a Slack integration, calling one or two APIs. The interesting question is when that shape stops being enough. This article gives the decision framework we use with clients moving from a chatbot to an MCP-backed agent platform.
The shape of a chatbot
A chatbot is a thin layer: user input goes to an LLM with a fixed system prompt, the LLM may call one of a small number of pre-declared functions, and the result comes back. State, if any, is conversation history kept in memory or in a session store. The blast radius of a misbehaving model is bounded by the function set; the audit story is "we logged the conversation".
For a focused use case, this is the right answer. It ships in a sprint, the team learns what the model is and isn’t good at, and the cost is contained. Most organisations should start here.
Signals that the chatbot shape no longer fits
Four signals consistently mark the threshold where the wrapper approach stops paying off.
The tool surface stops fitting in one function list. A chatbot with three tools is simple. The same chatbot with thirty tools, owned by four teams, evolving on different cadences, is a coordination problem. Tool definitions drift, the system prompt grows, and the model spends a measurable fraction of its context budget reading the tool list. An MCP server lets each team own its plugin, register tools cleanly, version them, and expose only the relevant subset to a given agent through RBAC.
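As a rough illustration of per-team ownership and role-filtered exposure, here is a minimal sketch of a tool registry. It is not the API of any MCP server: the `Tool` and `ToolRegistry` classes, the tool names, and the role names are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tool:
    name: str                      # hypothetical tool name, e.g. "billing.refund"
    owner: str                     # team that owns and versions the tool
    version: str
    allowed_roles: frozenset[str]  # roles permitted to see and call the tool


@dataclass
class ToolRegistry:
    _tools: dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        # Each team registers its own tools; re-registering a name replaces it.
        self._tools[tool.name] = tool

    def visible_to(self, roles: set[str]) -> list[Tool]:
        # Only tools sharing at least one role with the agent are exposed,
        # so the model never reads definitions it is not allowed to use.
        return sorted(
            (t for t in self._tools.values() if t.allowed_roles & roles),
            key=lambda t: t.name,
        )


registry = ToolRegistry()
registry.register(Tool("tickets.search", "support", "1.2.0", frozenset({"support", "ops"})))
registry.register(Tool("billing.refund", "finance", "0.9.1", frozenset({"finance"})))
registry.register(Tool("deploy.status", "platform", "2.0.0", frozenset({"ops"})))

support_tools = [t.name for t in registry.visible_to({"support"})]
```

In this sketch an agent running with the `support` role sees one tool out of three, which is the context-budget saving the paragraph above describes.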
The system needs to take real action, not just answer. Reading data is a read. Mutating systems, creating tickets, issuing transfers, and sending messages are writes, and they demand a different operational discipline. They need confirmation flows, proposal queues, rollback paths, and audit guarantees that a chatbot does not provide by default. MCP-backed agents make those first-class: proposals are explicit, the audit trail records the template version and the typed inputs, and the tool bridge enforces RBAC at every call.
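A proposal queue can be sketched in a few lines. The version below is an assumption about what such a flow might look like, not a description of a specific platform: `Proposal`, `ProposalQueue`, the `payments.transfer` tool, and the `refund-mission@3` template version are invented for illustration. The point is that a write is recorded as a proposal first, and runs only after an explicit approval.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Proposal:
    tool: str
    inputs: dict[str, Any]
    template_version: str
    status: Status = Status.PENDING


@dataclass
class ProposalQueue:
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def _record(self, event: str, p: Proposal) -> None:
        # Every state change is logged with the template version and typed inputs.
        self.audit_log.append({
            "event": event,
            "tool": p.tool,
            "inputs": dict(p.inputs),
            "template_version": p.template_version,
        })

    def propose(self, p: Proposal) -> Proposal:
        self._record("proposed", p)
        return p

    def decide(self, p: Proposal, approve: bool,
               execute: Callable[[Proposal], Any]) -> Optional[Any]:
        if p.status is not Status.PENDING:
            raise ValueError("proposal already decided")
        if not approve:
            p.status = Status.REJECTED
            self._record("rejected", p)
            return None
        p.status = Status.APPROVED
        self._record("approved", p)
        result = execute(p)  # the write happens only after approval
        self._record("executed", p)
        return result


queue = ProposalQueue()
transfer = queue.propose(
    Proposal("payments.transfer", {"amount": 120, "to": "ACME"}, "refund-mission@3")
)
outcome = queue.decide(transfer, approve=True,
                       execute=lambda p: f"sent {p.inputs['amount']}")
```

A rejected proposal never reaches `execute`, and the log still shows that it was proposed and by which template version.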
The same capability needs to be reachable from more than one place. A chatbot lives in its chat UI. The moment the same business logic needs to be reachable from a desktop tool, a web app, a CI script, and the chat at once, the chatbot approach starts duplicating that logic. MCP draws the line at the protocol: tools are exposed once, every client speaks the same JSON-RPC, and a new front-end is a new adapter, not a new copy of the business logic.
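To make the "new adapter, not a new copy" point concrete, the sketch below builds the JSON-RPC 2.0 request that MCP uses for tool invocation (`tools/call` with `name` and `arguments`). The two adapters and the `tickets.create` tool are hypothetical; transport, initialisation, and error handling are left out.

```python
import json
from itertools import count
from typing import Any

_request_ids = count(1)


def tools_call_request(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def from_cli(argv: list[str]) -> dict[str, Any]:
    # Hypothetical CLI adapter: `create-ticket <title>`
    return tools_call_request("tickets.create", {"title": argv[1]})


def from_chat(message: str) -> dict[str, Any]:
    # Hypothetical chat adapter: the message text becomes the ticket title.
    return tools_call_request("tickets.create", {"title": message.strip()})


cli_request = from_cli(["create-ticket", "Printer offline"])
chat_request = from_chat("  Printer offline ")
wire_payload = json.dumps(cli_request)  # what would be sent over the transport
```

Both adapters only translate their own input format; the call that reaches the server has the same method and parameters either way.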
The compliance and security teams want a real answer. "We logged the conversation" is acceptable for an experiment. For something users rely on, the question is which version of which prompt the model saw, which tools it had access to, what inputs were passed, and what actions it actually took. The mission-as-template model on top of MCP gives a deterministic answer to all four; a bare chatbot gives an approximation.
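One way to picture a deterministic answer to those four questions is a single record per run with a stable digest. This is a sketch under stated assumptions: the `AuditRecord` shape, its field names, and the example values are illustrative and do not come from any particular platform.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class AuditRecord:
    prompt_version: str               # which version of which prompt the model saw
    tools_available: tuple[str, ...]  # which tools it had access to
    inputs: dict[str, Any]            # what typed inputs were passed
    actions: tuple[str, ...]          # what actions it actually took

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) makes the hash
        # independent of the order in which the inputs were supplied.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record_a = AuditRecord(
    prompt_version="onboarding@7",
    tools_available=("hr.lookup", "tickets.create"),
    inputs={"employee_id": "E-104", "site": "Lyon"},
    actions=("tickets.create",),
)
record_b = AuditRecord(
    prompt_version="onboarding@7",
    tools_available=("hr.lookup", "tickets.create"),
    inputs={"site": "Lyon", "employee_id": "E-104"},  # same inputs, other order
    actions=("tickets.create",),
)
same_run = record_a.digest() == record_b.digest()
```

Two runs with the same prompt version, tool set, inputs, and actions produce the same digest, which a conversation log alone cannot guarantee.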
The decision framework
Three concrete checks to run when the signals start firing:
- Count the tool calls per conversation. If most conversations resolve in two or three tool calls, the chatbot shape is fine. If the average is climbing past ten, the model is doing planning work that an agent loop will do more reliably and more cheaply.
- Count the distinct tool owners. Tools owned by one team can ship in a chatbot. Tools owned by four teams want a protocol with a registration story, RBAC, and independent versioning. That is what MCP provides.
- Look at the audit asks. If your compliance team is satisfied with conversation logs, you are not yet at the threshold. If they are asking for prompt-version provenance, capability boundaries per session, and replay from typed inputs, the platform conversation has started; MCP plus mission-as-template is the answer.
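The three checks can be run as a small script against usage data. The sketch below is illustrative only. The threshold of ten tool calls comes from the first check; the owner threshold of four is an assumption, since the text only contrasts one team with four, and the audit-ask labels are invented identifiers for the three requests named in the last check.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical labels for the audit requests that mark the threshold.
PLATFORM_AUDIT_ASKS = {
    "prompt_version_provenance",
    "capability_boundaries_per_session",
    "replay_from_typed_inputs",
}


@dataclass
class Signals:
    tool_calls_per_conversation: list[int]  # one entry per sampled conversation
    tool_owners: set[str]                   # teams that own at least one tool
    audit_asks: set[str]                    # what compliance has asked for


def run_checks(s: Signals, call_threshold: float = 10,
               owner_threshold: int = 4) -> dict[str, bool]:
    """Return which of the three checks point towards a platform."""
    return {
        # Needs at least one sampled conversation; mean() raises on an empty list.
        "tool_calls": mean(s.tool_calls_per_conversation) > call_threshold,
        "tool_owners": len(s.tool_owners) >= owner_threshold,
        "audit_asks": bool(s.audit_asks & PLATFORM_AUDIT_ASKS),
    }


small_bot = Signals([2, 3, 2], {"support"}, {"conversation_logs"})
grown_bot = Signals(
    [12, 9, 14],
    {"support", "finance", "platform", "hr"},
    {"replay_from_typed_inputs"},
)
small_result = run_checks(small_bot)
grown_result = run_checks(grown_bot)
```

The thresholds are parameters on purpose: the useful output is the trend over time, not a single pass or fail.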
What the migration looks like
A chatbot migration to an MCP platform is rarely a rewrite. The chatbot stays, with its UI and its session store; what changes is what it talks to. The function definitions become MCP tools, the system prompt becomes a mission template, and the brain abstraction lets the existing LLM backend keep serving while new backends (local models, alternative providers) become drop-in replacements through the same port.
The first deployment usually keeps every existing capability on the same model, behind the new platform. The platform advantages (RBAC, audit, versioned missions, multi-client, proposal queue) come for free. The next deployments start pulling specialised work to specialised modules, splitting teams cleanly, and exposing the same tools to new front-ends.
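The "function definitions become MCP tools" step is often close to a rename. The sketch below assumes the chatbot declares its functions in the common `name` / `description` / JSON Schema `parameters` shape used by function-calling APIs; an MCP tool descriptor carries the same schema under `inputSchema`. The `create_ticket` function is a made-up example.

```python
from typing import Any


def to_mcp_tool(function_def: dict[str, Any]) -> dict[str, Any]:
    """Map a function-calling definition onto an MCP tool descriptor."""
    return {
        "name": function_def["name"],
        "description": function_def.get("description", ""),
        # The JSON Schema is reused as-is; only the key name changes.
        "inputSchema": function_def.get(
            "parameters", {"type": "object", "properties": {}}
        ),
    }


# Hypothetical existing chatbot function definition.
legacy_function = {
    "name": "create_ticket",
    "description": "Open a support ticket.",
    "parameters": {
        "type": "object",
        "properties": {"title": {"type": "string"}},
        "required": ["title"],
    },
}

mcp_tool = to_mcp_tool(legacy_function)
```

The handler behind the function does not need to change in this step; only the declaration moves from the chatbot's function list to the server's tool list.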
When to stay with the chatbot
With a small team, a small tool surface, a low-stakes domain, and no compliance pressure, the chatbot is the right answer. The platform discussion is premature until the signals fire. The choice is not between chatbot and platform forever; it is about which shape fits the current scope and when to make the transition.
For the platform side of that transition, see Self-Hosted MCP Infrastructure for Enterprise, Building a ReAct Agent on Top of MCP, and Mission-as-Template: Declarative AI Agents in Production.
