
What 'agentic AI' really means — and the three places it earns its keep

The term is being used to mean everything and nothing. Here's the technical floor for what counts as agentic, and the three deployment patterns that survive contact with a real business.

Every vendor pitch has a slide that says "agentic." Most of them mean "we wrote a prompt." A smaller subset mean "we wired a model to one tool." A still smaller subset mean what the term is supposed to mean. If you're going to spend money on this, you need to know which one you're buying.

This post is the working definition we use internally, and the three deployment patterns where the architecture actually earns its cost. Everything else is a chatbot wearing the wrong t-shirt.

The technical floor

A system is agentic when it has all four of these properties at once:

  1. A goal it can articulate. Not a script — an objective. "Qualify this inbound lead" rather than "respond with template B."
  2. Tools it can call. Real tools that change state in real systems — CRM writes, calendar holds, database reads, web fetches, file generation. Not just text-completion.
  3. A feedback loop. It observes the result of each tool call and decides what to do next based on the result. The path is not predetermined.
  4. A stop condition. It knows when the goal is met (or when to escalate to a human) and exits cleanly.

A model that drafts an email is not agentic. A model that decides whether to draft an email, drafts it, sends it, watches for a reply, and escalates if no reply lands in 48 hours — that's agentic. The agency is in the loop, not the language.

This matters because almost every "AI agent" being sold today is missing at least one of those four properties. Usually it's the feedback loop (the system runs once and stops) or the stop condition (the system runs forever and racks up token cost). Both are debilitating in production.
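The four properties can be sketched as a single loop. This is a minimal illustration, not a real framework: the tool names, the `decide` stub, and the return shape are all hypothetical, and in a real build the `decide` function would be a model call.

```python
# Minimal sketch of an agentic loop: a goal, a bounded set of tools,
# a feedback loop, and a stop condition. All names here are illustrative.

MAX_STEPS = 10  # hard stop condition: the loop can never run forever


def run_agent(goal, tools, decide):
    """Decide -> act -> observe, until the goal is met or the budget runs out."""
    history = []
    for _ in range(MAX_STEPS):
        action = decide(goal, history)           # pick the next step from observed results
        if action["tool"] == "done":             # goal met: exit cleanly
            return {"status": "done", "history": history}
        if action["tool"] not in tools:          # unknown tool: escalate, don't guess
            return {"status": "escalate", "history": history}
        result = tools[action["tool"]](action["args"])  # act on a real system
        history.append((action, result))         # observe: feeds the next decision
    return {"status": "escalate", "history": history}   # step budget exhausted: human takes over


# A stub "decide" that qualifies a lead in two steps, then stops.
def decide(goal, history):
    if not history:
        return {"tool": "crm_lookup", "args": {"email": "lead@example.com"}}
    if len(history) == 1:
        return {"tool": "send_reply", "args": {"template": "tailored"}}
    return {"tool": "done", "args": {}}


tools = {
    "crm_lookup": lambda args: {"found": True, "pipeline": "inbound"},
    "send_reply": lambda args: {"sent": True},
}

outcome = run_agent("qualify this inbound lead", tools, decide)
```

Note that the loop ends one of two ways: `done` or `escalate`. A production system needs both exits; "runs forever" is not an option.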

If you've already read Automation vs. AI: which one your business actually needs first, the agentic layer is what sits on top of automation when the next decision genuinely depends on what just happened — and you don't want to wake a human up to make it.

Where agentic AI earns its keep

Three patterns. Every successful agentic deployment we've shipped fits one of them. Anything outside these three is usually a worse, slower, more expensive version of either a normal automation or a normal model call.

1. Inbound triage with action authority

The job: read unstructured inbound (email, form text, voicemail transcript), decide what kind of inquiry it is, take the appropriate next action, and only escalate to a human when the case is genuinely ambiguous.

Why agentic and not just "AI": the action authority is the agent part. A model that classifies the inquiry is just a model. A system that classifies, then writes the lead to the right CRM pipeline, then drafts a tailored first reply, then books a discovery call if the inbound matches a high-intent pattern, then flags ambiguous cases for the founder — that's an agent. The path through that decision tree is different for every inbound, which is why a static automation can't do it.

What separates a working version from a broken one:

  • Bounded tool surface. The agent can do exactly what it needs to do and nothing else. No "general internet access," no shell, no arbitrary CRM writes. Each tool has a typed schema and the system rejects calls that don't match.
  • Human-in-the-loop on irreversible actions. Sending a contract, refunding a payment, or replying to a sensitive thread always passes through a person. The agent drafts, the human approves.
  • Observable state. Every decision the agent made on every inbound is logged in a way the founder can audit. If a lead got misrouted, you can see why.
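A bounded tool surface can be enforced mechanically. Here is one way to sketch it, assuming a simple allow-list of tools with typed fields; the tool names and schemas are hypothetical, and a real build would likely use a schema library rather than raw `isinstance` checks.

```python
# Bounded tool surface: every tool has a declared schema; any call to an
# unknown tool, or with missing/extra/wrong-typed fields, is rejected.

TOOL_SCHEMAS = {
    "crm_write": {"lead_id": str, "pipeline": str},
    "calendar_hold": {"lead_id": str, "slot_iso": str},
}


def validate_call(tool, args):
    """Return True only if the tool is on the allow-list and args match its schema exactly."""
    schema = TOOL_SCHEMAS.get(tool)
    if schema is None:
        return False  # tool not on the allow-list: no shell, no arbitrary writes
    if set(args) != set(schema):
        return False  # missing or extra fields
    return all(isinstance(args[k], t) for k, t in schema.items())
```

The point of the exact-match check is that the agent cannot smuggle in extra parameters: the surface is exactly what you declared, nothing more.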

Done well, this is the single highest-leverage agentic pattern for a services business. It compounds: every inbound the system handles sharpens the routing, so the next reply lands faster.

2. Multi-step research that branches

The job: investigate something where the next question depends on the last answer. Competitive analysis, pre-call client briefing, document review, vendor research, technical due diligence.

Why agentic and not just "AI": research is recursive. You can't write a static prompt that says "find everything relevant" because what's relevant only becomes obvious once you've found the first thing. The agent reads, plans the next query, reads, refines, and stops when it has enough — exactly the loop a junior analyst would run, except in 90 seconds.

This is also where we see the strongest argument for prompt caching: the system prompt and the source materials don't change between turns, only the working memory does. With caching, every loop iteration is dramatically cheaper than the first, which is what makes the whole pattern viable on a real budget.

What separates a working version:

  • A defined output shape. "Produce a one-page brief with these five sections" beats "research and report." The output shape is the stop condition.
  • Source attribution. Every claim in the output points back to where it came from. Without this, the brief looks impressive and is unverifiable.
  • A budget. Token budget, time budget, or query budget — pick one and enforce it. Research agents without budgets are how you get a $400 invoice for one report.
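The budget point is worth making concrete. Below is one way a token budget might be enforced in the research loop; the `fetch` function is a stub standing in for your retrieval layer plus a model call, and the costs and follow-up logic are invented for illustration.

```python
# A budgeted research loop: query -> read -> refine, where the next query
# depends on the last answer. The cap is checked *before* spending.

def research(question, fetch, token_budget=2000):
    """Run the loop until the plan or the token budget runs out."""
    spent, notes, queries = 0, [], [question]
    while queries and spent < token_budget:
        q = queries.pop(0)
        doc, cost, follow_ups = fetch(q)      # what's relevant emerges from what was found
        if spent + cost > token_budget:
            break                             # enforce the cap before the spend, not after
        spent += cost
        notes.append((q, doc))
        queries.extend(follow_ups)            # the branch: new questions from the last answer
    return {"notes": notes, "tokens_spent": spent}


# Stub fetch: returns (document, token_cost, follow-up queries).
def fake_fetch(query):
    follow_ups = [f"{query} pricing"] if "pricing" not in query else []
    return (f"notes on {query}", 600, follow_ups)


report = research("competitor X", fake_fetch, token_budget=2000)
```

Time budgets and query budgets slot into the same structure; the essential property is that the check happens inside the loop, so no single iteration can blow past the cap.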

3. Long-running ops loops

The job: monitor something over time, decide when to act, act, observe the result, and continue. Lead nurturing across weeks, knowledge-base maintenance, recurring report generation with anomaly flagging, support-ticket follow-up after the first reply.

Why agentic and not just "AI": time is the input. A static automation can run on schedule, but it can't decide whether the schedule is right. An agent watches state ("this lead has gone quiet for 9 days, and the last touch was a generic email"), decides to act ("send a tailored follow-up referencing the specific service they asked about"), and watches what happens next.

This is the pattern most often misimplemented as "agentic" when it isn't. A cron job firing the same template into the same channel every Tuesday is not agentic. A system that decides what to send, to whom, and when, based on a goal it's measured against — that's agentic.

What separates a working version:

  • Clear success metric. "Reply rate," "qualified meetings booked," "time-to-resolution." The agent needs a number to push against, not a vibe.
  • Throttling. The agent should not be able to spam. Hard caps on outbound volume per day per recipient.
  • Easy override. A human can pause the loop with one click and the loop respects it.
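Throttling and override are small amounts of code with outsized safety value. A minimal sketch, with an invented cap and class shape, just to show where the checks sit relative to the send:

```python
# Hard cap on outbound per recipient per day, plus a one-click pause flag.
# The cap value and the class shape are illustrative.
from collections import defaultdict


class OpsLoop:
    DAILY_CAP = 2  # hard cap: the agent physically cannot spam past this

    def __init__(self):
        self.sent_today = defaultdict(int)  # reset by a daily scheduler in a real build
        self.paused = False                 # a human flips this with one click

    def maybe_send(self, recipient, send):
        """Gate every outbound action through the override and the cap."""
        if self.paused:
            return "paused"
        if self.sent_today[recipient] >= self.DAILY_CAP:
            return "throttled"
        send(recipient)
        self.sent_today[recipient] += 1
        return "sent"


sent_log = []
loop = OpsLoop()
results = [loop.maybe_send("lead@example.com", sent_log.append) for _ in range(3)]
loop.paused = True
results.append(loop.maybe_send("lead@example.com", sent_log.append))
```

The ordering matters: the pause check comes before everything else, so the human override wins even when the cap would have allowed the send.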

Where agentic AI does not earn its keep

Every pattern outside those three is something we've seen pitched as agentic and watched fail. The common ones:

  • "Chat with our docs." This is retrieval-augmented generation, not an agent. It's also fine — RAG is a great pattern for the right use case. Just don't pay agent prices for it.
  • "AI replaces our sales team." No. AI qualifies, drafts, and surfaces. Closing is still human work, and pretending otherwise burns trust with prospects who can tell when they're being run through a script.
  • "Agentic content production." Models can draft. Drafting is not the bottleneck — voice, taste, and editorial judgment are the bottleneck. An agent that publishes unmoderated content is one bad day from a brand crisis.
  • "General-purpose office agent." The agents that work are the ones with narrow scope and observable state. The general-purpose ones either don't ship, or ship and quietly stop being used because nobody trusts them.

The pattern: agency without bounds is a liability. Agency with a bounded tool surface, a defined goal, and observable state is leverage. The whole job of the engineering team is to keep the system in the second category.

How to evaluate a vendor pitch

When someone sells you "an agentic AI system," ask these in order:

  1. What goal does it have? If they describe features instead of a goal, it's not agentic.
  2. What tools can it call, and who can change that list? If the list is open-ended, that's a security and cost problem.
  3. What does its decision log look like? If you can't audit a past run, you can't trust the system.
  4. What's the stop condition? If the answer is "it stops when the user closes the chat," that's a chatbot.
  5. What happens on a wrong action? If the answer involves a refund or an apology, demand a human-in-the-loop on that action class.

If the pitch survives those five questions, you're probably looking at a real system. If it doesn't, you're looking at a wrapper, and the wrapper is being priced like the system underneath.

What to do next

Most businesses don't need three agents. They need one — placed at the bottleneck where structured automation runs out of road, with a tight scope and a clear success metric. Pick the inbound, the research, or the loop. Ship it. Measure it. Then expand.

If you're trying to figure out which of the three patterns matches your bottleneck — or whether you're at the agentic stage at all yet — request a strategy call. We'll walk through the actual workflow and tell you what shape of system actually fits, including the cases where the answer is "not yet."

For founders heading into a vendor evaluation, also see How to brief an AI consultant for the questions to come prepared with. The cost of a bad agentic build is high enough that the briefing is worth doing carefully.

PUT IT TO WORK

Like what you read? Let's build the system.

Request a strategy call. We'll translate the playbook into a custom build for your business.

Request a Strategy Call