Playbook · Long read · 22 min

The economics of enterprise AI. Six shifts reshaping technology strategy.

The hard part of enterprise AI isn’t the technology; it’s the economics. Drawing on two decades leading transformation across global enterprises, this is the field guide to the six shifts that decide whether AI ever reaches your P&L.

How to read this · Linear, six acts.

Jimi Li·02 Jun 2026·22 min read

81% of organizations deploying AI have yet to report meaningful bottom-line gains (McKinsey · 10,000+ leaders).

$6.31T in global IT spending in 2026. The money is flowing.

13.5% growth, driven almost entirely by AI infrastructure. The returns aren’t.

The 81% figure comes from McKinsey’s State of Organizations 2026. The technical part - choosing models, building pipelines, deploying agents - is the easy part now. The part that matters is economics: knowing where value shows up and where it doesn’t, and getting the cost structures right before you scale the wrong thing.

I’ve watched this movie before. Twenty years ago I saw enterprises pour money into “digital transformation” without understanding what they were buying. Half those initiatives failed. The patterns that killed them - hero leaders, disconnected use cases, activity-based metrics, hostile cultures - are already forming in AI. The difference this time: higher stakes, compressed timeline.

For mid-to-large enterprises, the AI pivot isn’t a technology upgrade. It’s a fundamental recalibration of IT economics - six interconnected shifts:

01 Token cost governance. The new unit economics of AI compute
02 Build vs. buy realignment. Why 76% of AI is now purchased
03 SaaS pricing evolution. The death of seat-based models
04 Budget & resource allocation. Run vs. Change strategy
05 Technical debt & modernization. AI as a debt-reduction tool
06 Human capital & workforce. The reality most leaders get wrong

01 Token cost governance

The token is the new unit of compute.

A single user prompt can cost a penny - or several thousand dollars. Most organizations are flying blind.

The token pricing spread - 2,800× variance between AI models, from $0.06/M tokens (budget LLM) to $168/M tokens (frontier LLM).

I spent an hour last month with a CFO who couldn’t explain why his AI costs tripled in Q1. His team had deployed a handful of copilots. Nothing dramatic. But nobody was watching the meter.

This is the new normal. As enterprises scale generative AI, managing the fundamental unit of compute - the token - has become a primary economic discipline.

Traditional SaaS costs were predictable: Cost = Headcount × Price per Seat. Today, spending is decoupled from human users and tied to token consumption. One employee summarizing email might burn 10,000 tokens; another running reasoning tasks in the same seat burns 10 million - a 1,000× spread for identical “seats.”
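To make the contrast concrete, here is a minimal Python sketch of the two cost models. The token volumes are the ones quoted above; the seat price and the blended token price are illustrative assumptions, not figures from any vendor.

```python
def seat_cost(headcount: int, price_per_seat: float) -> float:
    """Traditional SaaS: cost scales with the number of licensed users."""
    return headcount * price_per_seat


def token_cost(tokens: int, price_per_million: float) -> float:
    """Usage-based AI: cost scales with tokens consumed, not with users."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical prices, chosen only for illustration.
PRICE_PER_SEAT = 30.0       # dollars per user per month (assumed)
PRICE_PER_MILLION = 15.0    # dollars per million tokens (assumed)

email_user = token_cost(10_000, PRICE_PER_MILLION)          # light summarization
reasoning_user = token_cost(10_000_000, PRICE_PER_MILLION)  # heavy reasoning

print(f"Seat model, 2 users: ${seat_cost(2, PRICE_PER_SEAT):.2f}")
print(f"Email user:          ${email_user:.2f}")
print(f"Reasoning user:      ${reasoning_user:.2f}")
print(f"Spread:              {reasoning_user / email_user:,.0f}x")
```

Under the seat model both employees cost the same. Under the token model the spread depends only on consumption, so it stays at 1,000× whatever price you plug in.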

Fig 02·Cost per task, by model class

Task	Model class	Est. cost
Employee onboarding query	Budget-tier LLM	$0.10
Basic case management	Mid-tier LLM	$0.30
Field service scheduling	Standard LLM	$0.60
Complex legal reasoning	Frontier LLM	$7.50
Agentic codebase scan	Reasoning LLM	$300.00

The 2,800× pricing variance

Here’s the number that should wake up your finance team. Per AI Cost Check’s February 2026 analysis, per-million-token pricing ranges from $0.06 to $168 - a 2,800× spread.

Without a routing layer, every query defaults to the most capable, most expensive model. A simple FAQ lookup doesn’t need a frontier reasoning engine - but that’s what it gets. The queries don’t look expensive one at a time. Then the invoice arrives.
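A routing layer can be sketched in a few lines of Python. This is a simplified illustration, not a production design: the tier names, the mid-tier price, and the 1-to-3 capability scale are assumptions, and only the $0.06 and $168 endpoints come from the pricing analysis above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                 # hypothetical tier label, not a real product
    price_per_million: float  # dollars per million tokens
    capability: int           # assumed scale: 1 = simple lookups, 3 = deep reasoning


# Illustrative catalog; the mid-tier price is invented for the example.
CATALOG = [
    ModelTier("budget", 0.06, 1),
    ModelTier("mid", 3.00, 2),
    ModelTier("frontier", 168.00, 3),
]


def route(required_capability: int) -> ModelTier:
    """Return the cheapest tier whose capability meets the requirement."""
    eligible = [m for m in CATALOG if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model tier can handle this task")
    return min(eligible, key=lambda m: m.price_per_million)


def default_route() -> ModelTier:
    """No routing layer: every query goes to the most capable tier."""
    return max(CATALOG, key=lambda m: m.capability)


faq = route(required_capability=1)
unrouted = default_route()
print(f"FAQ with routing:    {faq.name} at ${faq.price_per_million}/M")
print(f"FAQ without routing: {unrouted.name} at ${unrouted.price_per_million}/M")
```

The hard part in practice is the input this sketch takes for granted: deciding the required capability of each request before it is sent.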

The agentic multiplier

While conversational AI scales linearly, agentic AI scales geometrically. A single trigger initiates planning, sub-agents, and recursive loops - agents calling agents calling agents - producing thousands of hidden token transactions the original user never sees.
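One way to see the geometric growth is to count calls in a deliberately simple model where every agent call spawns the same number of sub-calls down to a fixed depth. Real agent graphs are messier; the fan-out, tokens per call, and price below are hypothetical values chosen for illustration.

```python
def total_calls(fan_out: int, depth: int) -> int:
    """Count model calls when every call spawns `fan_out` sub-calls,
    down to `depth` levels below the original trigger (a geometric series)."""
    return sum(fan_out ** level for level in range(depth + 1))


def run_cost(fan_out: int, depth: int, tokens_per_call: int,
             price_per_million: float) -> float:
    """Dollar cost of one trigger, assuming uniform token use per call."""
    calls = total_calls(fan_out, depth)
    return calls * tokens_per_call / 1_000_000 * price_per_million


# Hypothetical workload: 5 sub-agents per call, 20k tokens per call, $15/M.
for depth in range(5):
    calls = total_calls(5, depth)
    cost = run_cost(5, depth, 20_000, 15.0)
    print(f"depth {depth}: {calls:>4} calls, ${cost:,.2f} per trigger")
```

With a fan-out of 5, one user request becomes 781 model calls at depth 4. The user still sees a single answer.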

One healthcare enterprise racked up $6 million in unplanned costs in six months from unmanaged agentic loops before finance could intervene. Inside Microsoft, Meta, and Amazon, “tokenmaxxing” - feeding entire codebases into reasoning engines to hit adoption mandates - has pushed individual monthly developer bills past $150,000.

Token FinOps

To survive this volatility, enterprises need Token FinOps on four pillars: real-time telemetry (dashboards, not monthly invoices), granular cost attribution (tag requests to teams and workflows), outcome maxing (value per token, not raw volume), and intelligent model routing (send each task to the cheapest model that can deliver).
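Two of those pillars, cost attribution and value per token, can be sketched as follows. The record fields, the sample log, and the idea that a dollar value can be credited to each request are all assumptions made for illustration; in practice the value side is the hard part to measure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    team: str                 # attribution tag
    workflow: str             # attribution tag
    tokens: int
    price_per_million: float  # dollars per million tokens
    value_usd: float          # business value credited to the request (assumed known)

    @property
    def cost(self) -> float:
        return self.tokens / 1_000_000 * self.price_per_million


def cost_by_tag(requests, tag: str) -> dict:
    """Sum request cost grouped by an attribution tag ('team' or 'workflow')."""
    totals = defaultdict(float)
    for r in requests:
        totals[getattr(r, tag)] += r.cost
    return dict(totals)


def value_per_million_tokens(requests) -> float:
    """Outcome metric: dollars of value delivered per million tokens burned."""
    tokens = sum(r.tokens for r in requests)
    value = sum(r.value_usd for r in requests)
    return value / tokens * 1_000_000 if tokens else 0.0


# Hypothetical telemetry log.
LOG = [
    Request("support", "faq", 200_000, 0.06, 40.0),
    Request("support", "escalation", 1_000_000, 3.00, 120.0),
    Request("legal", "contract-review", 2_000_000, 168.00, 900.0),
]

print(cost_by_tag(LOG, "team"))
print(cost_by_tag(LOG, "workflow"))
print(f"${value_per_million_tokens(LOG):,.2f} of value per million tokens")
```

The point of tagging at request time is that the same log answers both questions: who spent it, and what it bought.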

“A vibe-coded system of record is a lawsuit waiting to happen.”

On the limits of build, in Act 02.

02 Build vs. buy, realigned

The binary is obsolete.

The decision isn't build or buy. It's where you sit on a five-point spectrum - and which posture fits which capability.

The five postures spectrum: Buy, Configure, Extend, Compose, Build - from lower cost/risk to higher differentiation.

In 2024, the build-vs-buy split was nearly balanced (47% built vs. 53% bought). By late 2025 the ratio flipped hard: 76% of all AI use cases are now purchased rather than built internally (Menlo Ventures, 2025).

I call it the Great AI Flip - and it’s an admission of failure for custom builds that turned out to be academic projects, not production tools. But the decision was never really binary. It’s where you sit on a spectrum (I unpack each posture in Build vs. Buy: the spectrum is dead):

Buy. Vendor product as-shipped. Your process conforms to the software.
Configure. Live inside a platform (Salesforce, ServiceNow, Workday). Their framework, your logic.
Extend. Keep a trusted core; build differentiating capability on top via APIs and agents.
Compose. Orchestrate multiple systems where no single vendor owns the truth.
Build. Own the full stack. The capability is the business.

AI coding moved the optimal posture one step right - not all the way to Build

Here’s what the hype gets wrong. AI coding collapsed the cost of building edges and glue. It did not collapse the cost of building cores. ERP, HCM, EHR, billing, general ledger - still dominated by domain depth, compliance burden, and multi-decade data models. The real sweet spot AI unlocked is Extend: bolt-on differentiation on top of trusted cores, without ripping out the system of record. Three questions decide the posture:

Is this capability strategic advantage, or table stakes?
Who absorbs liability when it breaks?
Is the data uniquely yours, or is someone else’s dataset the moat?

Sourcing trends

76% purchased. Three-quarters of AI use cases are now covered by vendor solutions rather than internal builds.

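The pilot graveyard. The numbers vary by source - Gartner has reported an 85% pilot-to-production failure rate, and MIT NANDA found that 95% of enterprise GenAI projects fail to accelerate revenue - but the pattern is consistent, and brutal. I’ve seen organizations with 30+ active pilots and zero production deployments. That’s not innovation; it’s expensive theater.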

Model obsolescence risk. Internal 18–24 month build cycles can’t keep pace with exponential model evolution. By the time you ship, the foundation has shifted. Vendors absorb that risk; internal builds don’t.

Organizational readiness - be honest about it

Before deciding to build, assess your real capacity. Most organizations overestimate all three.

AI maturity. Early-stage firms lack the MLOps discipline to maintain custom builds.
Talent scarcity. Can you actually hire - and keep - the ML and platform engineers a bespoke build needs? The best people want interesting problems at scale. That’s a constraint to design around, not a criticism.
Infrastructure. Clean pipelines, governance, and scalable compute are prerequisites, not nice-to-haves.

Fig 05·Eight traps of AI-assisted dev

Velocity illusion. Code outpaces safe review.
Specification mirage. Polished output hides bad requirements.
Architectural drift. Local optimization, global incoherence.
Security amplification. Insecure patterns propagate faster.
Black-box dependency. Shipping code no one reasoned through.
Edge-case blind spot. Tests mirror the implementation.
Tech-debt acceleration. Cheap to create, costly to maintain.
Skills atrophy. Less practice on what matters when AI fails.

Win condition: manage complexity best - not generate code fastest.

The hidden cost of moving right

AI-assisted “vibe coding” enables rapid prototyping but creates a dangerous illusion: the final 20% of a production release - security, governance, reliability - is 80% of the engineering effort. Teams celebrate a working prototype, then spend six months making it production-ready. The demo looked 80% done. It was 20%.

AI reduces the cost of writing code. It does not reduce the cost of managing complexity. Eight traps emerge - and they compound the further right you move on the spectrum. I take each one apart in The 8 traps of vibe coding.

Fig 06·Posture economics

Posture	Upfront	Time-to-market	Maintenance	Debt risk	Obsolescence
Buy	Low	Days–weeks	Vendor	Low	Vendor
Configure	Low–Med	Weeks–months	Shared	Low–Med	Mostly vendor
Extend	Medium	3–6 months	Shared	Medium	Split
Compose	Med–High	6–12 months	You	Med–High	Mostly you
Build	High	12–24 months	You	High	You

Posture economics

The winning strategy isn’t a single posture - it’s matching posture to capability. As you move right, you take on more cost, time, and technical debt, but own more differentiation.

For most enterprises, Extend hits the sweet spot: you own the differentiated layer while vendors absorb the core-system burden and obsolescence risk. If you can’t answer “how do we catch these failure modes before production?” - you’re not ready for Build or Compose.

03 SaaS pricing evolution

The seat is dying.

When AI completes the task, what are you paying for? Vendors are scrambling. So is procurement.

The SaaS pricing shift - from seat-based licensing to per-resolution, action credits, and value-aligned billing.

I’ve sat in budget meetings where finance had no idea how to model next year’s software costs. The pricing meter changed mid-contract. Nobody knew what to put in the forecast.

The old playbook - negotiate seats, forecast by headcount - doesn’t work anymore. When agentic AI completes a workflow end-to-end, the value isn’t tied to human engagement time. An agent can automate 100% of a workflow while requiring zero seats.

Two forces collide: value now comes from automated outcomes, but high inference costs force vendors to monetize usage. Layering AI fees on top of legacy seats is the trap everyone’s trying to escape.

Fig 08·Three pricing models

01 Per-automated-resolution. Zendesk $1.50–$2.00 each · Intercom $0.99 flat (Fin) · Sierra AI $150M ARR
02 Action-based credits. Salesforce Flex Credits: $500 / 100k · 20 credits per action · ~$0.10 / record
03 Value-aligned billing. Price tracks business value delivered - the vendor wins only when you win

Consumption & outcome pricing

The industry is converging on three models. The throughline: the vendor wins only when you win.

Outcome pricing is already real revenue - Sierra AI crossed $150M ARR on it. The open question for every CIO is which meter you’re actually signing up for, and whether you can forecast it.

Fig 09·Mid-market case management · 100 users

100 users × 3 cases/day × 3 actions/case × 20 working days × 20 credits/action = 360,000 credits / month

$1,800 per month at $0.005 / credit ($500 per 100k)

A worked example: case management, 100 users

Run the math on action-based credits and the monthly number is small and predictable - about $1,800. That’s the comfortable part.

What’s not predictable is the spike: a product recall, a service outage, a seasonal rush. Consumption pricing means your software cost now moves with business volume. That’s a fundamentally different risk profile than fixed seats - and it belongs in the forecast as a range, not a line.
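The worked example, and the "range, not a line" forecast, can be reproduced in a short Python sketch. The baseline inputs are the ones from the figure above; the scenario multipliers are hypothetical and should be replaced with your own volume history.

```python
def monthly_credits(users: int, cases_per_user_per_day: float,
                    actions_per_case: float, working_days: int,
                    credits_per_action: int) -> float:
    """Credits consumed in a month under action-based pricing."""
    return (users * cases_per_user_per_day * actions_per_case
            * working_days * credits_per_action)


def credits_to_dollars(credits: float, pack_price: float = 500.0,
                       pack_size: int = 100_000) -> float:
    """Convert credits to dollars at $500 per 100k credits."""
    return credits * pack_price / pack_size


baseline_credits = monthly_credits(100, 3, 3, 20, 20)
baseline_cost = credits_to_dollars(baseline_credits)

# Hypothetical volume multipliers for the spikes named above.
SCENARIOS = {"baseline": 1.0, "seasonal rush": 1.5, "product recall": 3.0}
forecast = {name: baseline_cost * m for name, m in SCENARIOS.items()}

for name, cost in forecast.items():
    print(f"{name:<15} ${cost:,.0f}")
low, high = min(forecast.values()), max(forecast.values())
print(f"Forecast range: ${low:,.0f} to ${high:,.0f} per month")
```

At the assumed 3× recall month, the same contract costs $5,400 instead of $1,800. Nothing about the deal changed except business volume.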

Headless SaaS

We’re moving from a UI-dominant world (95%) to a fragmented model: 60% UI / 35% headless / 5% classic APIs. The Model Context Protocol (MCP) lets agents talk directly to backends - no human-facing interface at all. The software runs; no one “uses” it.

For CIOs, 35% of engagement no longer requires individual seat licenses. When I started in technology, we measured software value by how many people used it. Now we measure it by what gets done - whether or not a human touched the keyboard.

Strategic procurement guidelines

Choose the right meter. Tie pricing to ticket resolution or outcomes, not raw API calls.
Develop telemetry first. Instrument utilization before signing consumption-based contracts.
Equip commercial teams. Train procurement to negotiate on ROI and business value, not seat counts.
Design for flexibility. Use “digital wallets” to monitor consumption and avoid unbudgeted overages.
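As an illustration of the last guideline, a "digital wallet" can be as simple as a running balance with alert thresholds. The class name, the thresholds, and the charge amounts below are hypothetical; a real wallet would sit on the vendor's usage feed rather than manual entries.

```python
from dataclasses import dataclass, field


@dataclass
class CreditWallet:
    """Hypothetical prepaid wallet: tracks spend against a monthly budget."""
    budget_usd: float
    alert_thresholds: tuple = (0.5, 0.8, 1.0)  # assumed fractions of budget
    spent_usd: float = 0.0
    fired: set = field(default_factory=set)    # thresholds already alerted

    def record(self, amount_usd: float) -> list:
        """Add a charge and return any thresholds crossed for the first time."""
        self.spent_usd += amount_usd
        used = self.spent_usd / self.budget_usd
        crossed = [t for t in self.alert_thresholds
                   if used >= t and t not in self.fired]
        self.fired.update(crossed)
        return crossed

    @property
    def remaining_usd(self) -> float:
        """Budget left; negative once the wallet is overdrawn."""
        return self.budget_usd - self.spent_usd


wallet = CreditWallet(budget_usd=1800.0)
for charge in (600.0, 500.0, 900.0):
    alerts = wallet.record(charge)
    print(f"charged ${charge:,.0f}, alerts: {alerts}, "
          f"remaining: ${wallet.remaining_usd:,.0f}")
```

Each threshold fires once, so finance hears about the 80% mark when it is crossed, not on every charge afterward.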

04 Budget & allocation

The question isn’t how to fund AI. It’s what to turn off.

Every CIO knows they need to invest in AI. What they don't know is what to stop funding to make room for it.

Fig 10·Run vs. Change · three moves

01 Decommission legacy Run. Name the apps you’ll retire up front. You can’t fund Change by hoping Run costs fall on their own.
02 Standardize shared platforms. Design for reuse so each new capability replaces a legacy system instead of adding one to maintain.
03 Hold a 33% Change floor. At least a third of the tech budget on Change vs. Run. Below that line, you can’t keep pace with AI-native competitors.

I ask a simple question: “What did you turn off last year?” If the answer is “nothing,” they’re accumulating, not modernizing.

Every system you keep running is a system you’re choosing to fund instead of AI, and AI now eats up to a third of enterprise “Change” budgets. Funding it without ballooning total spend takes a deliberate strategy across three moves.

Skip them and you land in Strained Transformation: heavy AI investment piled on top of unretired legacy, where new platforms govern old debt and ROI flattens as costs climb.

The four IT archetypes on a 2×2 of Run Intensity vs Change Investment: Deliberate Modernizers (winners), Strained Transformers (trap), Lean Operators (risk), Heavy IT Sustainers (anchor).

The four IT archetypes (McKinsey)

When I assess an organization’s AI readiness, this is where I start. Plot how much you spend keeping the lights on against how much you free up for change, and the quadrant you land in tells you almost everything about your trajectory.

Two of these quadrants feel like progress and aren’t. One compounds; one quietly sinks AI spend into legacy. The work of the next two years is moving toward the low-Run, high-Change corner before your competitors get there.
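A rough way to place yourself on the 2×2 is sketched below. The quadrant mapping is inferred from the descriptions in this act, and the thresholds are assumptions: Run intensity is treated as Run spend relative to a peer benchmark you supply, and Change investment as Change share of budget measured against the one-third floor. McKinsey's own scoring method is not reproduced here.

```python
def classify(run_spend: float, change_spend: float, peer_run_spend: float,
             change_floor: float = 1 / 3) -> str:
    """Place an IT budget in one of the four archetypes.

    run_spend, change_spend: annual budget in the same currency unit.
    peer_run_spend: comparable Run spend for peers (hypothetical benchmark).
    change_floor: minimum Change share of total budget counted as 'high'.
    """
    total = run_spend + change_spend
    if total <= 0:
        raise ValueError("budget must be positive")
    high_change = change_spend / total >= change_floor
    high_run = run_spend > peer_run_spend

    if high_change and not high_run:
        return "Deliberate Modernizer"
    if high_change and high_run:
        return "Strained Transformer"
    if not high_change and not high_run:
        return "Lean Operator"
    return "Heavy IT Sustainer"


# Hypothetical budgets in $M, against a peer Run benchmark of 70.
for run, change in [(60, 40), (90, 50), (50, 10), (90, 10)]:
    print(f"Run {run}, Change {change}: {classify(run, change, 70)}")
```

The useful output is the direction of travel rather than the label: rerun it with next year's planned decommissions and see whether the quadrant moves.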

The 81% value-realization gap

Despite massive capital deployment, most organizations still can’t point to bottom-line gains. That’s turned AI into a board-level concern and forced CFOs from loose “change” budgets to tightly governed spend where ROI is non-negotiable.

The gap isn’t a technology problem. It’s a targeting problem - most organizations are investing AI dollars in the wrong value category.

Three types of AI value: Employee Value (efficiency), Clear ROI (measurable returns), Strategic Value (new products and business models).

Where value actually shows up

Employee value makes teams faster - but time saved is not money earned. At best you’re optimizing expenses. Clear ROI embeds intelligence into core products: revenue lift, cost reduction, margin. Strategic value creates new products and business models - highest risk, highest reward.

The turning point in my own career: I stopped reporting “30% faster ticket resolution” and started showing “$20M added through AI-driven products.” Same technology. One gets a pat on the back; the other gets a seat at the strategy table.

Focus 80% of investment on Clear ROI and Strategic value. The 81% are stuck because they’re optimizing tasks instead of targeting P&L.

CFO ROI demands

The era of experimental AI budgets is ending. Every AI dollar now needs a path to margin, revenue, or cost reduction. One concrete lever already showing up in the data: AI-driven operational efficiency is cutting M&A deal costs by ~20% and shrinking transaction timelines 10–30% by automating diligence and integration. For PE-backed firms, that’s direct value creation.

“Automating chaos gives you faster chaos.”

On technical debt, in Act 05.

05 Technical debt

You can’t automate a broken process.

79% of leaders call technical debt the primary hurdle to business objectives. AI on top of that debt makes it worse, not better.

I’ve seen organizations deploy AI on top of broken processes and wonder why it didn’t help. Automating chaos gives you faster chaos.

Technical debt is the silent tax on mid-market innovation. 79% of technology leaders view it as the primary hurdle to business objectives, and 25–40% of developer time is lost addressing it. AI can reclaim that capacity - but only if you fix the underlying workflow debt first.

The AI modernization sequence: 1 automated discovery, 2 intelligent refactoring, 3 automated test generation, 4 self-healing deployments.

Workflow debt - and the four-stage fix

Two forms of debt will undermine AI before it starts. Manual process fragmentation: work moving between systems by email, spreadsheets, and human memory - each handoff a failure point. Broken automation layers: brittle old integrations held together by tribal knowledge that snap when agents expect clean data flows.

Automating a broken process generates unstable “AI technical debt” that compounds faster than the legacy debt you already have. Refactor the business logic before deployment, not after.

This is the same lesson I learned at GE twenty years ago: lean before digitize. Fix the workflow first, then automate it. Different technology, identical principle.

Done in sequence - discovery, refactoring, test generation, self-healing - AI becomes a debt-reduction tool, not just a coding assistant.

The payoff is structural: top performers keep Run costs 20% lower than peers by using AI to systematically reduce technical debt - which frees the capital that funds AI growth. Organizations that can’t reduce their debt burden are stuck funding AI by cutting headcount. Which is exactly where the human-capital problems begin.

06 Human capital

The whiteboard isn’t where the work happens.

The math seems obvious: AI does the work, we need fewer people. The organizations acting on that assumption are making expensive mistakes.

I’ve had this conversation with dozens of executives. The math seems obvious on a whiteboard: AI does the work, we need fewer people. But the whiteboard isn’t where work actually happens.

AI is making individuals more productive - but “more productive” doesn’t mean “fewer people needed.” In practice: it compresses routine tasks but expands the surface area of judgment work; faster execution creates more iterations, not fewer hours; automated first drafts need more sophisticated review. The time saved on production shifts to validation, exceptions, and coordination.

The organizations capturing real value aren’t cutting headcount. They’re redesigning what humans do - from routine execution to judgment, oversight, and accountability.

The Junior Squeeze

The primary labor-market trend of 2026 is not mass displacement. It’s a strategic Junior Squeeze.

The data. AI has trimmed monthly payroll growth by ~16,000 jobs - a 0.1pp impact on unemployment.
The financing logic. It’s capital reallocation: cutting junior payroll to fund AI infrastructure and compute.
Why junior. Entry roles are easier to automate because they lack contextual judgment and institutional accountability. A junior summarizing a meeting is one thing; one interpreting a contract is another.
The anomaly. In 2025, under 1% of layoffs were actually due to AI productivity. The squeeze showed up mostly as refusals to hire - entry-level hiring frozen to fund AI.

The AI rehire trap: 1 cut headcount to fund AI, 2 lose institutional knowledge, 3 AI systems underperform, 4 rehire at premium rates - the cycle repeats. 50% rehire within two years.

The rehire reality

Using labor cuts to self-fund AI - the AI Layoff Trap - ignores the cost of re-acquisition. Gartner predicts that by 2027, half of the companies that cut customer service staff because of AI will rehire for similar roles.

Why? They underestimated the residual human work that makes AI systems function - and they cut the very people who understood the edge cases, exceptions, and institutional context AI can’t replicate. The cut, the celebration, then 18 months rebuilding capability at higher cost with consultants filling the gap.

The consultant spend shift - “institutional forgetting”

The human-capital equation isn’t only headcount; it’s where external spend goes. The pattern: reduce internal junior hiring → increase AI-implementation consultants → those consultants build systems the internal team doesn’t fully understand → when they leave, no one can maintain or govern the AI → rehire consultants at premium rates, or rebuild from scratch. You’ve traded internal expertise for external dependency.

The Deliberate Modernizer does the opposite: allocate 1.5×–4× more to internal staff for Change initiatives. Treat human capital as a strategic asset, not an outsourced expense - even if it’s slower at first.

Career-ladder erosion

Hollowing out junior roles threatens the Capability Stack. Delete the entry layer and you destroy the pipeline that creates future experts - eventually you lack the senior talent required to oversee and audit AI. The uncomfortable question: if you automate the training-ground work, who verifies the AI’s output in five years?

This is a failure of epistemic resilience - the organizational capacity to preserve tacit knowledge over time. The people who’ll need to catch AI mistakes in 2030 are the juniors we’re not hiring in 2026.

Fig 15·Four value-weighted task categories

Task category	Description	AI impact
Judgment / framing	Problem definition, methodological choice	Stays human
Exception / oversight	Validating outputs, managing edge cases	Expands
Coordination / relational	Stakeholder trust, human handoffs	Stays human
Institutional accountability	Assuming liability for system errors	Cannot be automated

The elevation framework

“Elevate before you eliminate” is an economic strategy, not an HR talking point. Role redesign should reallocate time into four value-weighted categories - and notice that AI expands oversight work while leaving judgment, relationships, and accountability firmly human.

Fig 16·High-risk role elevation paths

Role family	AI-compressed substrate	Elevated human layer
Recruiting	Sourcing, scheduling	Interview calibration, mobility strategy
Software eng.	Boilerplate, unit tests	Architecture, reliability ownership
Customer support	FAQ retrieval, routing	Complex escalations, root-cause
Legal / compliance	Document review, research	Edge-case escalation, audit defense
Finance	Report gen, reconciliation	Assumption validation, risk reading

Where the elevation actually happens

For every role family under pressure, the AI-compressed substrate is the bottom of the job; the durable value migrates up. The strategy isn’t to eliminate the role - it’s to move the human into the elevated layer the substrate used to subsidize.

Elevate-first vs layoff-first: task-level proof vs role-level slogans, paid upskilling vs self-directed learning, internal mobility vs external hiring, apprenticeship preservation vs ladder destruction, worker consultation vs top-down mandates, financing transparency vs payroll reallocation.

The elevate-first audit

Before declaring headcount redundant, run six checks - each with a tempting layoff-first shortcut that quietly destroys capability. Task-level proof over role-level slogans. Paid upskilling over “reskill on your own time.” Internal mobility over external hiring. Apprenticeship preservation over ladder destruction. Worker consultation over top-down mandates. Financing transparency over using payroll as the AI piggy bank.

The temporary cost spike

Here’s what boards must internalize: AI adoption creates a temporary cost spike, not immediate savings. You maintain current headcount (the humans who know the work), pay for AI infrastructure and licensing, fund training and role redesign, and accept reduced productivity during transition. The toughest challenge of 2026: organizations are being asked to increase AI spend by up to 50% before realizing labor savings. You’re paying for the agent and the human it’s meant to augment. Cut too fast and you’ll rehire within two years, behind competitors who elevated instead of eliminated.

Strategic synthesis

Six shifts, one playbook.

This transition isn’t about adopting new technology. It’s about governing new economics. The organizations treating AI as a tech upgrade are the ones stuck in the 81%.

Recap·The six shifts at a glance

Shift	Key insight	Action
Token cost governance	2,800× pricing variance between models	Token FinOps: real-time telemetry + intelligent routing
Build vs. buy	76% purchased; 85%+ of pilots fail	Build only what differentiates; assess readiness honestly
SaaS pricing	Seat-based models are dying	Consumption telemetry; outcome-based negotiation
Budget allocation	Target 33% Change; avoid the Strained Transformer trap	Decommission legacy deliberately; standardize platforms
Technical debt	25–40% of developer time lost to debt	AI for discovery, refactoring, test gen, self-healing
Human capital	50% of companies making AI-driven cuts rehire within 2 years	Elevate before you eliminate; protect the pipeline

The winning profile

The winners will be Deliberate Modernizers.

I’ve been through major technology transitions before - client-server, web, mobile, cloud. Each reshuffled winners and losers. AI is doing the same, faster, with higher stakes. The winners keep Run costs 20% lower than peers, allocate 1.5×–4× more to internal talent, and redesign roles around judgment rather than eliminating them for short-term savings.

Success is no longer measured by AI adoption volume. It’s measured by the ability to reduce legacy Run costs to fund sustainable AI growth - while preserving the human capability required to govern it.

The patterns that doomed half the digital transformations a decade ago are already forming in AI:

The hero leader who owns “the AI strategy” while everyone else waits.
The disconnect between people designing AI use cases and people doing the work.
The measurement of activity instead of behavior change.
The culture that punishes experimentation and rewards predictability.

Which pattern is forming in your organization? We have a small window not to repeat them.

Sources

McKinsey, The State of Organizations 2026 - 81% value-realization gap; survey of 10,000+ senior leaders across 15 countries and 16 industries.
Menlo Ventures, 2025: The State of Generative AI in the Enterprise - 76% of AI use cases purchased; 495 U.S. enterprise decision-makers (Nov 2025).
Gartner, Predicts Half of Companies That Cut Customer Service Staff Due to AI Will Rehire by 2027 - 50% rehire prediction (Feb 2026).
MIT NANDA (via Fortune), State of AI in Business 2025 - 95% of enterprise GenAI projects fail to accelerate revenue. Gartner has separately reported the 85% pilot-to-production failure rate cited.
AI Cost Check, February 2026 token-pricing analysis - 2,800× per-million-token pricing variance ($0.06 to $168), driven by the gap between budget LLMs and frontier reasoning models.