Your AI Business Case Is Missing 70% of the Costs

Token Costs Are Just the Tip of the Iceberg: most AI business cases model less than a fifth of total cost of ownership

I've built AI-driven platforms at enterprise scale in financial services (issuing, payments, and digital banking) and across many other sectors. I've watched more AI business cases fail than succeed. Not because the AI didn't work, but because nobody accounted for the real cost of making it work.

Token pricing was modeled. Infrastructure was budgeted. But the prompt-engineering iterations, the model migrations when providers deprecated versions, the knowledge-base curation that required actual humans with domain expertise, the compliance reviews mandated by financial regulators — none of it was in the spreadsheet.

The result? Over-committed budgets. Under-resourced operations. Expensive shelfware that started as a brilliant proof-of-concept. To tackle this, I put together a framework.

The Blind Spot

Most organizations model AI costs with a simple equation: tokens consumed × price per token. For a single API call, that math works fine. For an enterprise agentic system — where one "run" chains 20–50 LLM calls across reasoning, tool use, validation, and retry loops — it's a trap.

Here's what the token-only model misses:

The agentic multiplier. A migration agent I helped build averaged 22 LLM calls per run — after optimization. One run is not one API call. It's a cascade.
Human-in-the-loop (HITL) costs. In regulated financial services, virtually every agent output requires human review. At a $110/hr loaded cost and 1.5 minutes of review per run, that is about $2.75 per run, which can be 3–5x the token cost. This doesn't appear in any vendor pitch deck.
Day-2 operations. Model migrations, prompt drift, knowledge-base curation, incident response — these add 20–35% to run-rate costs, and they're never in the Year 1 budget.
The risk tax. Hallucination remediation, model validation, audit preparation, PII handling — in financial services these aren't optional. They're the cost of doing business responsibly.

When I add it all up, token costs typically represent only about 15–20% of the true total cost of ownership. If your business case starts and ends with API pricing, you're making a capital-allocation decision with a fifth of the data at best.
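To make the multiplier and the review cost concrete, here is a minimal Python sketch of the per-run arithmetic. The $110/hr rate, the 1.5 minutes of review, and the 22 calls per run are the figures quoted above; `COST_PER_CALL` is a hypothetical blended price per LLM call, chosen only so that 22 calls land near the $0.55 token cost shown in the per-run chart, and the function names are illustrative.

```python
# Illustrative inputs. COST_PER_CALL is an assumption, not a provider price.
LOADED_RATE_PER_HOUR = 110.0   # reviewer loaded cost, USD per hour
REVIEW_MINUTES_PER_RUN = 1.5   # human review time per agent run
CALLS_PER_RUN = 22             # LLM calls chained in one agent run
COST_PER_CALL = 0.025          # hypothetical blended token cost per call, USD


def token_cost_per_run(calls_per_run: int, cost_per_call: float) -> float:
    """Token cost of one run: number of chained calls times cost per call."""
    return calls_per_run * cost_per_call


def hitl_cost_per_run(rate_per_hour: float, review_minutes: float) -> float:
    """Human review cost of one run: hourly rate prorated to review minutes."""
    return rate_per_hour * review_minutes / 60.0


naive = token_cost_per_run(1, COST_PER_CALL)                # "one run = one call"
agentic = token_cost_per_run(CALLS_PER_RUN, COST_PER_CALL)  # with the multiplier
hitl = hitl_cost_per_run(LOADED_RATE_PER_HOUR, REVIEW_MINUTES_PER_RUN)

print(f"Token cost, single call: ${naive:.3f}")
print(f"Token cost, agentic run: ${agentic:.2f}")
print(f"HITL review cost:        ${hitl:.2f}")
print(f"HITL / token ratio:      {hitl / agentic:.1f}x")
```

Swapping in your own call count, blended price, and reviewer rate shows quickly whether review or tokens dominates in your context.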

[Figure: Where the cost actually sits. Tokens (~15%) are the tip of the iceberg; HITL review (~25-35%), Day-2 ops, risk and compliance, and infrastructure make up the ~85% beneath the waterline.]

The Cost of a Single Run

Modeled at 4,000 runs per month with a 22-call agentic multiplier, the economics invert what most decks assume: human review, not tokens, dominates the per-run cost.

[Figure: True cost per agent run. HITL review ($2.75) is roughly 5x the token and model cost ($0.55), out of a total of $5.37 per run.]
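A short Python sketch of how those per-run figures roll up at 4,000 runs per month. Only the HITL, token, and total amounts come from the chart; the `other` bucket is simply the remainder (orchestration, infrastructure, and the rest), which the chart summary does not itemize. Amounts are kept in integer cents to avoid floating-point drift.

```python
RUNS_PER_MONTH = 4_000

# Per-run costs in cents, taken from the chart above.
TOTAL_CENTS = 537   # $5.37 total per run
HITL_CENTS = 275    # $2.75 human review
TOKEN_CENTS = 55    # $0.55 token and model cost

# Remainder not itemized here (assumed to cover everything else per run).
other_cents = TOTAL_CENTS - HITL_CENTS - TOKEN_CENTS


def monthly_usd(cents_per_run: int, runs: int) -> float:
    """Monthly cost in USD for a per-run cost given in cents."""
    return cents_per_run * runs / 100


monthly = {
    "hitl": monthly_usd(HITL_CENTS, RUNS_PER_MONTH),
    "tokens": monthly_usd(TOKEN_CENTS, RUNS_PER_MONTH),
    "other": monthly_usd(other_cents, RUNS_PER_MONTH),
    "total": monthly_usd(TOTAL_CENTS, RUNS_PER_MONTH),
}

for name, usd in monthly.items():
    print(f"{name:>6}: ${usd:,.0f} per month, ${usd * 12:,.0f} per year")
```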

Eight Dimensions Your Business Case Is Missing

Over the past couple of years, I've identified eight distinct cost dimensions that together constitute the complete economics of an enterprise AI agent.

[Figure: The eight dimensions of agent total cost of ownership, from first prompt to production retirement: Run-Rate Economics through the 36-month TCO Timeline.]

Run-Rate Economics. Tokens, RAG, HITL, orchestration, infrastructure. The visible layer — but even here most models miss critical components.
Build & Pre-Production. $80K–$200K in prompt engineering, evaluation, data preparation, integration, and security review. Typically absorbed by engineering budgets and never attributed to the agent.
Fleet Portfolio. Enterprises deploy fleets, not single agents. Shared infrastructure, inter-agent communication, and cross-subsidization create portfolio dynamics.
Day-2 Operations. Model migrations, prompt drift, knowledge-base curation, incident response. The most underestimated dimension I've encountered.
Risk & Compliance. Model validation, audit preparation, PII handling, hallucination remediation. Non-negotiable in regulated industries.
Scaling & Portability. Volume discounts, caching economics, rate limiting, and vendor lock-in. Costs don't scale linearly.
Value & Opportunity. Quality improvements, speed-to-market, capacity reallocation, customer experience. This is where value worth 2–5x the labor savings hides.
TCO Timeline. A 36-month lifecycle view of cumulative investment vs. return and program-level breakeven. The only metric that matters for capital allocation.
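The TCO Timeline can be sketched as a cumulative cash-flow curve over 36 months. The Python below is a simplified illustration, not the white paper's formula. Every input is hypothetical: a $144,000 build cost (inside the $80K–$200K range above), a run-rate of 4,000 runs at $5.37, Day-2 operations at 25% of run-rate (inside the 20–35% range), and an assumed monthly value of $38,850.

```python
from itertools import accumulate
from typing import List, Optional

MONTHS = 36


def cumulative_net(build_cost: float, monthly_value: float,
                   monthly_cost: float, months: int = MONTHS) -> List[float]:
    """Cumulative net position; index 0 is the upfront build spend."""
    flows = [-build_cost] + [monthly_value - monthly_cost] * months
    return list(accumulate(flows))


def breakeven_month(cumulative: List[float]) -> Optional[int]:
    """First month where cumulative return covers cumulative investment."""
    for month, balance in enumerate(cumulative):
        if balance >= 0:
            return month
    return None  # never breaks even inside the modeled window


# Hypothetical inputs, for illustration only.
build_cost = 144_000
monthly_run_cost = 21_480                   # 4,000 runs x $5.37
monthly_day2_cost = 0.25 * monthly_run_cost  # assumed 25% Day-2 uplift
monthly_value = 38_850                      # assumed total monthly value

curve = cumulative_net(build_cost, monthly_value,
                       monthly_run_cost + monthly_day2_cost)
print("Breakeven month:", breakeven_month(curve))
print(f"Net position at month {MONTHS}: ${curve[MONTHS]:,.0f}")
```

A flat monthly value and cost is the simplest possible assumption; ramp-up periods, model migrations, and volume growth would each bend the curve and are worth modeling separately.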

Three Things I Wish Someone Had Told Me

HITL costs dominate token costs. In every agent I've deployed in financial services, human review was the largest single cost component. Reducing review time by 30 seconds per run shifts economics more than a 50% token-price reduction. Start there.
Day-2 operations quietly consume a huge share of the budget. A single model migration consumed 45 engineering hours and $3,200 in eval costs. We had three in one year — $135K+ that wasn't in anyone's original budget. Allocate 20–35% of Year 1 run-rate for ongoing operations, or your business case is incomplete.
Labor savings alone undervalue the investment. When I compressed delivery timelines from 12–18 weeks to 4–5 weeks, the value wasn't just "minutes saved." It was faster revenue recognition, fewer errors, redeployed talent doing higher-value work, and measurably improved customer experience. If your business case only models minutes removed, you're leaving the strongest argument on the table.
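The first of these points is easy to check with the article's own illustrative figures. The Python sketch below compares the per-run saving from 30 fewer seconds of review at $110/hr against a 50% cut applied to a $0.55 token cost per run; the function names are illustrative.

```python
LOADED_RATE_PER_HOUR = 110.0  # reviewer loaded cost, USD per hour
TOKEN_COST_PER_RUN = 0.55     # token and model cost per run, USD
RUNS_PER_MONTH = 4_000


def review_saving(rate_per_hour: float, seconds_saved: float) -> float:
    """Per-run saving from shortening human review by `seconds_saved`."""
    return rate_per_hour * seconds_saved / 3600.0


def token_saving(token_cost_per_run: float, price_cut: float) -> float:
    """Per-run saving from a fractional token-price reduction."""
    return token_cost_per_run * price_cut


saving_review = review_saving(LOADED_RATE_PER_HOUR, 30)
saving_tokens = token_saving(TOKEN_COST_PER_RUN, 0.50)

print(f"30s less review: ${saving_review:.2f} per run, "
      f"${saving_review * RUNS_PER_MONTH:,.0f} per month")
print(f"50% token cut:   ${saving_tokens:.2f} per run, "
      f"${saving_tokens * RUNS_PER_MONTH:,.0f} per month")
```

Under these inputs the review-time saving is more than three times the token saving, which is why review workflow is the better place to start optimizing.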

The Question That Changes Everything

Most business cases ask: "Does this agent save time?" The answer is almost always yes.

The better question is: "At what volume does the total value — across labor, quality, speed, capacity, and customer experience — exceed the total cost — across tokens, infrastructure, build amortization, Day-2 operations, risk, and compliance?"
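One simple way to frame that question is a break-even volume: fixed monthly costs divided by the margin each run contributes. The Python sketch below assumes a linear model, and apart from the $5.37 cost per run every number is hypothetical: a $144,000 build amortized over 36 months, $6,000 per month of fixed Day-2 and compliance cost, and $9.37 of total value per run. Amounts are in integer cents so the division is exact.

```python
from typing import Optional


def break_even_runs(fixed_monthly_cents: int, value_per_run_cents: int,
                    cost_per_run_cents: int) -> Optional[int]:
    """Smallest monthly run count at which total value covers total cost.

    Returns None when a run costs at least as much as the value it creates,
    because no volume can then recover the fixed costs.
    """
    margin = value_per_run_cents - cost_per_run_cents
    if margin <= 0:
        return None
    return -(-fixed_monthly_cents // margin)  # integer ceiling division


# Hypothetical inputs, for illustration only.
build_amortization = 14_400_000 // 36  # $144,000 build over 36 months
fixed_ops = 600_000                    # assumed $6,000/month Day-2 + compliance
fixed_monthly = build_amortization + fixed_ops

volume = break_even_runs(fixed_monthly,
                         value_per_run_cents=937,   # assumed value per run
                         cost_per_run_cents=537)    # $5.37 cost per run
print("Break-even volume (runs per month):", volume)
```

The value-per-run input is where labor, quality, speed, capacity, and customer-experience effects would each need their own estimate; a single number here is a deliberate simplification.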

That's a fundamentally different question. And it's the one that separates organizations that adopt AI smartly from those that adopt it fast and regret it later.

The Full Framework

I've written a comprehensive white paper that unpacks all eight dimensions with specific formulas, cost ranges, decision frameworks, and the practitioner insights I've accumulated building AI agents in production financial environments. It includes:

Detailed breakdowns of each cost dimension with real-world ranges.
The formulas behind effective token cost, break-even analysis, and hallucination exposure.
A risk-scoring matrix for enterprise AI agents.
A decision framework for build, scale, optimize, or retire decisions.

If you're building, buying, or evaluating enterprise AI agents, especially in financial services, send me a message to learn more.

A Final Thought

AI agents are not a cost center. They are a transformation engine.

The organizations that will win the AI era aren't the ones that adopt fastest. They're the ones that adopt smartest.

All figures, percentages, and cost ranges shown are illustrative and based on generalized enterprise scenarios. Actual costs vary significantly by organization, industry, agent complexity, provider pricing, and regulatory environment. Use these as directional benchmarks — not actuals — and calibrate to your own context.