We ship AI agents for enterprise teams every week. We wrote this because vendor pages will never tell you not to buy, and agency pricing pages quote $3k to $300k without saying for what. This page gives you four things:
An AI agent is software that takes a goal, picks the tools it needs, and acts until the goal is met. It combines four pieces: a language model, a knowledge base, tools it can call, and the freedom to choose the order. A chatbot answers questions. An AI agent finishes tasks.
Remove any one piece and it stops being an AI agent. A model alone is a chatbot. A model plus knowledge with no tools is retrieval-augmented generation (RAG). A model plus tools with no autonomy is a workflow: the model fills in blanks of a flow you wrote.
A bot answers. An AI agent acts, and decides for itself which action to take.
That single sentence is the line that separates the two categories. Anthropic uses the same split in Building Effective Agents: a workflow is a system where humans hard-code the steps the model follows. An agent is a system where the model picks the steps. The agentic AI category sits on the agent side of that line.
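The split between a workflow and an agent can be sketched in a few lines. This is a minimal illustration, not a production design: the three tools are hypothetical stand-ins for real integrations, and `pick_next_tool` is a hand-written rule standing in for the decision a language model would make.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tools: plain functions standing in for real integrations.
def search_flights(state: dict) -> dict:
    return {**state, "flight": "found"}

def book_flight(state: dict) -> dict:
    return {**state, "booked": True}

def reschedule_meeting(state: dict) -> dict:
    return {**state, "meeting_moved": True}

TOOLS: Dict[str, Callable[[dict], dict]] = {
    "search_flights": search_flights,
    "book_flight": book_flight,
    "reschedule_meeting": reschedule_meeting,
}

def run_workflow(state: dict) -> dict:
    """Workflow: a human fixed the order of the steps in advance."""
    for name in ["search_flights", "book_flight"]:
        state = TOOLS[name](state)
    return state

def pick_next_tool(state: dict) -> Optional[str]:
    """Stand-in for the model's choice. A real agent would ask an LLM here."""
    if "flight" not in state:
        return "search_flights"
    if not state.get("booked"):
        return "book_flight"
    if state.get("tight_connection") and not state.get("meeting_moved"):
        return "reschedule_meeting"
    return None  # nothing left to do: the goal is met

def run_agent(state: dict, max_steps: int = 10) -> Tuple[dict, List[str]]:
    """Agent: the next step is chosen at run time, until the goal is met."""
    trace: List[str] = []
    for _ in range(max_steps):  # hard cap so a bad policy cannot loop forever
        name = pick_next_tool(state)
        if name is None:
            break
        state = TOOLS[name](state)
        trace.append(name)
    return state, trace
```

Given a tight connection, the workflow books the flight and stops because nobody wrote a rescheduling step. The agent loop adds the `reschedule_meeting` call on its own.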
If you want to see one running, our production-ready AI agents are four role-tuned examples (copywriting, marketing, design, HR) you can try before reading the rest of this page.
The four types of AI agents map to one image: a clever friend you text. Each level adds a capability, from single-agent reply to multi-agent orchestration with tool calling.
Text in, text out.
Just the model. Like texting a friend who is clever but cannot open any app on their phone.
We orchestrate the if-this-then-that.
The bot fires specific tools, but a human writes the flow. Like texting that friend who can also book your flight, but only because you said "now book the flight."
The model picks the steps.
It decides which tool to use and when. Like a friend who books the flight, notices the connection is tight, reschedules your morning meeting, and tells you after.
A team of specialists, on live data.
Several specialised agents talk to each other. Like a small ops team: one watches calendars, one books travel, one handles expenses, they sync without you in the loop.
Most enterprise buyers ask for L4 multi-agent systems. Most should start at L2 (bot plus tools) and earn their way up. The economics are kinder, the eval surface is smaller, and the failure modes are easier to catch.
When somebody says "build an AI agent," the request splits into three independent design choices: compute, data, and capability. You can move each dial without moving the others.
The model that runs the inference. From frontier API at one end, to a custom model trained on your data at the other. Sets the cost-per-call and the data-residency posture.
What the model reads. From just the prompt, to RAG over your docs, to live API access, to real-time streams with a structured map of how everything connects. Sets the answer quality.
What the model is allowed to do. From answering, to firing one tool we told it about, to picking its own tools, to a whole team of agents working together. Sets what it can actually finish.
The size and speed of data retrieval is the fourth, hidden dial. At L4 capability with L5 data you are building a control-room AI agent, not a chatbot. Latency and throughput become non-negotiable. For software work, that data layer is where graph-memory AI agents sit.
Like hiring a top consultant by the hour.
You rent the best model in the world by the call from Claude, GPT, or Gemini. Best answers, charged per use, your data leaves your perimeter. Fastest to ship.
$1 to $30 per million tokens, per provider.
The same consultant, after a two-week onboarding.
Same hosted model, trained a little further on your data so it sounds like you and knows your edge cases. Still rented, now fluent in your business.
$0.50 to $3.20 per million training tokens, then standard inference.
Hiring a full-time in-house specialist.
Llama, Mistral, DeepSeek, Qwen running on your own server through Ollama or vLLM. No API key, no per-call fee, no data leaving the building.
$1,500 to $2,500 per H100 per month, plus engineering.
Raising a specialist from graduate.
Your own model, retrained on your data on a schedule. Most expensive to grow. Knows your world better than anyone else can.
$60k to $200k per training cycle, plus ongoing compute.
The compute axis is also a data-control axis. L1 sends prompts to a third-party API (Claude, GPT, Gemini). L3 and L4 do not. For regulated work in legal, healthcare, or finance, data residency rules often pick the rung for you. A self-hosted legal AI agent is the standard pattern there.
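The per-token and per-GPU prices quoted for the compute rungs can be turned into a rough break-even check. The sketch below is back-of-envelope arithmetic only. It assumes a single blended price per million tokens, ignores the engineering cost that the self-hosted rung adds, and does not check whether one GPU can actually serve the volume.

```python
def api_cost_per_month(tokens_per_month: float, price_per_million: float) -> float:
    """Monthly API bill at a flat price per million tokens."""
    return tokens_per_month / 1_000_000 * price_per_million

def break_even_tokens(gpu_cost_per_month: float, price_per_million: float) -> float:
    """Monthly token volume at which the API bill equals the GPU rent."""
    return gpu_cost_per_month / price_per_million * 1_000_000

# Illustrative inputs picked from inside the quoted ranges (assumptions, not quotes):
# $10 per million tokens for the API, $2,000 per month for one H100.
monthly_bill = api_cost_per_month(50_000_000, 10.0)   # 50M tokens a month
threshold = break_even_tokens(2_000.0, 10.0)
print(f"API bill at 50M tokens: ${monthly_bill:,.0f}")
print(f"Break-even volume: {threshold:,.0f} tokens per month")
```

With these inputs the API bill is $500 a month and the GPU only pays for itself above 200 million tokens a month, before any engineering time is counted.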
What you type is all the model knows. Asking a stranger a question with no context.
You paste in two or three worked examples so it copies the pattern. Showing a new hire two emails before asking them to write the third.
The agent looks things up in your PDFs, wiki, or knowledge base before answering. A lawyer who checks the case files before responding.
The agent reads your CRM, calendar, or warehouse in real time. An assistant who can open your inbox while you are talking.
The agent reacts to events as they happen and uses a structured map of how things connect. A control-room operator watching live dashboards with the wiring diagram of the building in front of them. Spiderbrain is one such graph memory, built for software projects.
Three concepts live on this axis that most AI agent guides skip: memory (what the agent remembers across sessions, not just inside one, ranging from short context window to long-term vector stores), MCP (the Model Context Protocol, now the standard plug for connecting tools and data to any agent), and reflection (the agent critiquing its own drafts before shipping). Reflection is a free upgrade at any compute level and most projects skip it. For long-term memory across an entire team, second-brain AI agents are the standard pattern.
If a human has to read, decide, and route every time, that decision is what an agent eats. Deterministic workflows are cheaper as plain automation.
CRM, ticketing, warehouse, code. The agent needs a place to look that is canonical and up to date. Without one, RAG is the only option and the answers age fast.
At least a hundred occurrences a month, ideally a thousand. Below that, the eval and ops cost will outweigh what the agent saves.
Without evals, you do not know when the agent regresses. Set aside ten to twenty percent of build cost per year for prompt and eval rework.
An agent that takes actions needs a human who owns its mistakes and approves its scope. Without one, the project drifts and stalls.
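The eval requirement in this checklist is the easiest one to make concrete. What follows is a minimal sketch of a regression check, assuming the agent can be called as a plain function from prompt to answer and that a case passes when the expected phrase appears in the answer. The two stub agents, the test cases, and the 2-point tolerance are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Case = Tuple[str, str]  # (prompt, phrase the answer must contain)

def pass_rate(agent: Callable[[str], str], cases: List[Case]) -> float:
    """Share of cases whose expected phrase appears in the agent's answer."""
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in agent(prompt).lower()
    )
    return passed / len(cases)

def has_regressed(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Flag a regression when the score drops by more than the tolerance."""
    return current < baseline - tolerance

# Hypothetical stub agents: lookup tables standing in for two prompt versions.
ANSWERS_V1: Dict[str, str] = {
    "refund window?": "Refunds are accepted within 30 days.",
    "support hours?": "Support is open 9 to 5 on weekdays.",
    "shipping time?": "Orders ship in 2 business days.",
    "warranty?": "Every product has a 1 year warranty.",
}
ANSWERS_V2 = {**ANSWERS_V1, "refund window?": "Refunds are accepted within 14 days."}

def agent_v1(prompt: str) -> str:
    return ANSWERS_V1.get(prompt, "")

def agent_v2(prompt: str) -> str:
    return ANSWERS_V2.get(prompt, "")

CASES: List[Case] = [
    ("refund window?", "30 days"),
    ("support hours?", "9 to 5"),
    ("shipping time?", "2 business days"),
    ("warranty?", "1 year"),
]

baseline = pass_rate(agent_v1, CASES)
current = pass_rate(agent_v2, CASES)
print(baseline, current, has_regressed(current, baseline))
```

A real harness would score answers with a rubric or a grader rather than substring matching, but the shape is the same: a fixed case set, a stored baseline, and an alert when the score drops.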
The honest part. Most buyers we talk to don't need a full AI agent. They need a workflow, a small bot, or to clean their data first. We'll tell you that to your face if it's true. If you want a second opinion on AI readiness, our short estimator tells you whether the engagement is worth a call.
The eval and on-call work will cost more than the time saved. Use a checklist or a script.
If the truth lives in someone's inbox or a tribal-knowledge spreadsheet, fix that first. An agent on bad data is worse than no agent.
If no one on your side will sign off on what the agent is allowed to do, the project will not ship. Get the owner before the engineers.
Some regulated environments require a human in the loop on every action. That is fine, but it changes the build: it is no longer an autonomous agent, it is a copilot.
These AI agent costs are 2026 bands from our own engagements and public benchmarks across McKinsey, Gartner, and vendor case studies. Two numbers per shape: the one-time build, and the monthly run-rate once it is live.
One scoped corpus, frontier API, light eval. Two to six weeks, one senior engineer.
Acts on one to ten systems. Eval harness, tool auth, observability. Six to fourteen weeks.
Open-weight model on your hardware. GPU procurement, vLLM serving, on-call. Ten to twenty weeks.
Several specialised agents on shared state. Coordination, audit-grade eval. Four to nine months.
On top of any of the above. The hidden cost is the dataset itself.
The hidden cost on top of all of these is the data work. If your source data is messy, scattered, or undocumented, budget $20k to $150k for cleaning, deduping, and labelling before the build sprint starts. Buyers routinely underestimate that data-prep cost. Every band above maps to one product line, so you can compare our AI agent products and pricing side by side.
Nine questions, each one narrowing your AI agent cost band. We give a range, not a precise number, because anyone quoting a precise AI agent build cost in 2026 without scoping is selling you something.
McKinsey reports a 5.8x ROI on AI agent deployments that reach production inside 14 months. The catch is that only about a quarter of projects get to production. The other three quarters stall on data, governance, or evaluation.
Most software slows down with scale. Our agents speed up. Bigger knowledge base means stronger retrieval, fewer misses, sharper citations. The hundredth question is answered better than the first.
Every eval failure becomes a labelled example. Every counsel edit, every reviewer note, feeds the next retrain. The agent on day 90 is meaningfully better than day one, and we publish the score every month.
Your perimeter, your residency, your IAM. No prompts or outputs leave the boundary unless you decide they should. EU, India, US, sovereign cloud, on-prem. We pick the rung that fits your compliance, not the one that fits our stack.
Autonomous on the routine. A human approves anything that spends money, signs paper, or changes a customer's record. The hallucination tax stops at the gate, not inside the inbox.
Prompt caching, batching, hybrid routing to the cheapest model that meets the quality bar. We track the bill weekly so a price change from a frontier provider does not become a finance call you have to make.
Eval harness from week one. Observability from week one. On-call runbook from week one. The interesting work happens because the boring work is already done.
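The human-approval rule described above (autonomous on the routine, gated on anything that spends money, signs paper, or changes a customer's record) can be sketched as a small gate in front of every action. The action kinds and the `approve` callback are hypothetical names chosen for illustration. In a real system the callback would route to a reviewer queue.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumed action kinds that always need a human sign-off.
GATED_KINDS = {"spend_money", "sign_document", "change_customer_record"}

@dataclass
class Action:
    kind: str
    description: str

def execute(action: Action, approve: Callable[[Action], bool], log: List[str]) -> bool:
    """Run routine actions directly; ask a human before any gated action."""
    if action.kind in GATED_KINDS and not approve(action):
        log.append(f"blocked: {action.description}")
        return False
    log.append(f"done: {action.description}")
    return True

def deny_all(action: Action) -> bool:
    """Stand-in reviewer who rejects everything, used to show the gate working."""
    return False

audit_log: List[str] = []
execute(Action("draft_reply", "draft reply to ticket"), deny_all, audit_log)
execute(Action("spend_money", "issue $40 refund"), deny_all, audit_log)
print(audit_log)
```

The routine draft goes through without asking anyone, while the refund is stopped at the gate and recorded in the log.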
The pattern is consistent. Anything tied directly to revenue or ticket deflection pays back fastest. A sales development AI agent hits the 3-month band most often. Anything indirect, like productivity, knowledge access, or internal comms, pays back slower with messier attribution.
Pick a product if the use case is generic and the vendor has already built it. Cheapest path. You give up customisation and you live inside the vendor's eval surface. Our ready-made agents are an example of this path.
Fast · low control
Pick this if you have an ML team, proprietary data, and the patience to build the eval and governance from scratch. Longest payback, highest ceiling, only path for some regulated industries. If you want a second pair of eyes on the architecture, AI agent consulting is the engagement shape we use for that.
Slow · full control
Pick this if you need it shipped in a quarter and you do not want to staff up. Total cost of ownership usually favours the partner route for the first agent. Solution Partners is the program we run for agencies that want the same.
Fast · shared control
Every AI agent engagement opens with a two-week AI agent discovery sprint. You leave it with a written plan: the use case in numbers, the shape of the agent, the data sources it needs, the eval rubric, the risk register, and a price. Then you decide whether to continue. If we conclude the project should not happen, we say so, and you do not pay for the build sprint that follows.
Two weeks. Use case, data audit, eval rubric, written plan.
Two-week sprints, weekly demos, written summary each Friday.
Eval harness, guardrails, observability, on-call runbook.
Monthly check-ins, prompt and eval rework, model upgrades.
An AI agent is software that takes a goal, picks the tools it needs, and acts until the goal is met. The difference from a chatbot is action. A chatbot answers questions; an agent finishes tasks.
A chatbot replies. An AI agent decides and executes. Agents can read your CRM, call 5+ tools in a single goal loop, write to a database, and keep going until the goal is met. A chatbot stops at the answer.
A focused single-purpose agent runs $15,000 to $60,000 to build. A multi-tool agent with CRM and ERP integration runs $40,000 to $180,000. A multi-agent workflow runs $150,000 to $400,000. Monthly run-rate is $600 to $80,000 depending on scale.
If the work is rules-based and deterministic, use plain automation. You only need an AI agent when the path to the answer changes per case, and a human normally has to read, decide, and route.
A scoped pilot ships in four to eight weeks. Production hardening with evals, guardrails, and observability takes another six to twelve weeks. Industry median time-to-value is around five months.
5.8x ROI within 14 months on AI agent investments that reach production, per McKinsey. The catch is that only about a quarter of projects hit that bar, which is why scoping matters more than the tech.
ChatGPT on its own is a chat interface. ChatGPT with tools, memory, and a goal loop is an AI agent. The brand name does not decide; the architecture does.
Hallucinated tool calls, prompt injection, runaway token cost, and unauthorised actions on connected systems. Most projects fail on governance rather than on the model.
Buy off-the-shelf if the use case is generic. Build in-house if you have an ML team and proprietary data. Hire a partner if you need it shipped in a quarter and you do not want to staff up. The partner route is typically 30 to 50 percent cheaper than in-house for the first agent on total cost of ownership.
Sales development agents pay back in a median of 1.7 months. Customer-support deflection takes 2 to 3 months. Finance and ops agents take around 4.5 months. Anything tied directly to revenue or ticket deflection pays back the fastest.
Yes. A FAQ-grade agent runs $5,000 to $20,000 to set up and $50 to $200 per month. The mistake is paying that and skipping the evaluation harness, which is what causes silent regressions.
Decline the project if the workflow runs fewer than 50 times a month, if the data is not in a system of record, if there is no owner accountable for the agent's actions, or if compliance does not allow autonomous writes.
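The four decline criteria above are simple enough to write down as a pre-scoping check. This is a minimal sketch: the `UseCase` fields are hypothetical names, and the only threshold used is the 50 runs a month stated above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UseCase:
    runs_per_month: int
    has_system_of_record: bool
    has_accountable_owner: bool
    autonomous_writes_allowed: bool

def reasons_to_decline(use_case: UseCase) -> List[str]:
    """Return every decline criterion the use case trips. Empty means proceed."""
    reasons: List[str] = []
    if use_case.runs_per_month < 50:
        reasons.append("workflow runs fewer than 50 times a month")
    if not use_case.has_system_of_record:
        reasons.append("data is not in a system of record")
    if not use_case.has_accountable_owner:
        reasons.append("no owner accountable for the agent's actions")
    if not use_case.autonomous_writes_allowed:
        reasons.append("compliance does not allow autonomous writes")
    return reasons

# Hypothetical example: decent volume and clean data, but nobody owns the agent.
candidate = UseCase(
    runs_per_month=400,
    has_system_of_record=True,
    has_accountable_owner=False,
    autonomous_writes_allowed=True,
)
print(reasons_to_decline(candidate))
```

Any single reason is enough to pause the project, so the function returns the full list rather than a yes or no.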
Send us the use case, however rough. Two weeks later you have a written plan and a price. If we tell you to walk away from it, you walk away with the plan as the parting gift.