How to Measure AI ROI: A 5-Layer Framework for Hong Kong Enterprise Leaders

A working framework for measuring AI return on investment with the rigour a CFO applies to capital allocation. Built for Hong Kong enterprise leaders.

2026-06-08

Gartner's 2026 research found that only 28% of enterprise AI use cases fully meet ROI expectations, while 20% fail outright. The cause is rarely the technology. In almost every audit, the cause is that the organisation never built a measurement framework before the project started. It measured activity, not impact.

This article gives Hong Kong enterprise leaders a working framework for measuring AI return on investment with the same rigour the CFO applies to any other capital allocation decision. By the end, you will know which five layers of value any serious AI programme must track, which KPIs map to each layer, and where most measurement frameworks quietly fall apart.

Why Do Most Enterprise AI Investments Fail to Show ROI?

The dominant cause is measurement design, not technical performance. McKinsey's 2025 State of AI report found that 60% of organisations have not seen enterprise-wide EBIT impact from their AI programmes, and only 21% have fundamentally rebuilt workflows around AI. Activity-based metrics like "prompts per week" or "users onboarded" hide the absence of business outcomes.

For most Hong Kong enterprises, the symptom looks similar across industries. A logistics firm runs a six-month AI document-processing pilot and reports "85% accuracy" to the steering committee. A regional bank deploys an AI assistant for relationship managers and tracks "monthly active users." A professional services group fine-tunes a model on internal templates and reports "hours saved per consultant."

None of these metrics is wrong, but none of them is what the CFO wants to see. The CFO wants a line item that connects to revenue, cost, or working capital. The measurement framework has to translate technical signals into that line item, and it has to do so before the project is approved, not after.

What Is the McKinsey Five-Layer AI Measurement Framework?

The McKinsey five-layer AI measurement framework structures every AI investment across five stacked layers, numbered from infrastructure at the base (Layer 5) up to financial impact at the top (Layer 1). Layer 5 is technical infrastructure. Layer 4 is model and solution performance. Layer 3 is user engagement and adoption. Layer 2 is workflow and operational outcomes. Layer 1 is bottom-line financial results.

The framework matters because each layer has different owners, different metrics, and different time horizons. Confusing them is the single most common reason AI programmes lose credibility with the board. When the head of digital transformation reports adoption numbers as if they were business impact, the CFO learns to discount everything that follows.
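One way to keep the layers from being confused in reporting is to write them down as an explicit data structure. The sketch below is illustrative only: the layer names and numbering come from the description above, while the owner roles and review horizons are hypothetical placeholders that each organisation would replace with its own.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Layer(IntEnum):
    """Layer numbers as used in the article: 5 is the base, 1 is the top."""
    FINANCIAL = 1
    WORKFLOW = 2
    ADOPTION = 3
    MODEL = 4
    INFRASTRUCTURE = 5


@dataclass(frozen=True)
class LayerSpec:
    scope: str
    owner: str    # hypothetical role, not named in the article
    horizon: str  # hypothetical review horizon


FRAMEWORK = {
    Layer.INFRASTRUCTURE: LayerSpec("Technical infrastructure", "Platform engineering lead", "weekly"),
    Layer.MODEL: LayerSpec("Model and solution performance", "Data science lead", "weekly"),
    Layer.ADOPTION: LayerSpec("User engagement and adoption", "Head of digital transformation", "monthly"),
    Layer.WORKFLOW: LayerSpec("Workflow and operational outcomes", "Business process owner", "quarterly"),
    Layer.FINANCIAL: LayerSpec("Bottom-line financial results", "CFO", "quarterly"),
}


def layer_above(layer: Layer) -> Optional[Layer]:
    """Return the layer that this layer's metrics feed, or None at the top."""
    return Layer(layer - 1) if layer > Layer.FINANCIAL else None


def evidence_chain(start: Layer = Layer.INFRASTRUCTURE) -> List[Layer]:
    """Walk upward from a starting layer to Layer 1."""
    chain = [start]
    nxt = layer_above(start)
    while nxt is not None:
        chain.append(nxt)
        nxt = layer_above(nxt)
    return chain


if __name__ == "__main__":
    for layer in evidence_chain():
        spec = FRAMEWORK[layer]
        print(f"Layer {int(layer)}: {spec.scope} (owner: {spec.owner}, reviewed {spec.horizon})")
```

Keeping owner and horizon next to each layer makes it harder to present a Layer 3 adoption figure as if it were a Layer 1 result.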

According to McKinsey's 2025 research, the group of AI adopters who fundamentally rebuilt workflows around the technology were 3.6 times more likely to report greater than 5% EBIT impact than those who layered AI onto existing processes. The five-layer framework forces this rebuild conversation upfront, because Layer 2 outcomes cannot improve unless the workflow itself changes.

How Do You Translate Technical Metrics Into P&L Impact?

Translation happens by mapping every layer's metric to the layer above it, ending at a financial line on the P&L. Model accuracy matters only if user adoption converts that accuracy into work. Adoption produces value only if workflow throughput or quality measurably changes. Workflow changes show on the P&L only if a cost, revenue, or capital line is directly affected.

A concrete example. A Hong Kong insurance underwriter deploys an AI document extraction model for policy onboarding. Layer 4 reports 92% extraction accuracy. Layer 3 reports 240 underwriters using the tool weekly. Layer 2 reports that average onboarding turnaround time dropped from 4.2 days to 2.7 days. Layer 1 reports that the operations cost per new policy declined by HK$84, and that the time saved enabled 14% more policies underwritten per quarter without headcount growth.

Only the Layer 1 number belongs in the board paper. The other layers exist to defend it under questioning. If the CFO asks how confident you are in the HK$84 figure, you walk back down through Layer 2, Layer 3, and Layer 4 to show the chain holds.
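The arithmetic behind the underwriting example can be sketched in a few lines. The turnaround times, the HK$84 saving, and the 14% volume uplift are the figures quoted above. The baseline of 10,000 policies per quarter is an assumption added purely for illustration, as is the simplification that the HK$84 saving applies to every policy at the new volume.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from before to after (negative means a reduction)."""
    return (after - before) / before


# Figures quoted in the underwriting example.
TURNAROUND_BEFORE_DAYS = 4.2
TURNAROUND_AFTER_DAYS = 2.7
SAVING_PER_POLICY_HKD = 84
VOLUME_UPLIFT_PCT = 14

# Assumption for illustration only: the example gives no policy volume.
BASELINE_POLICIES_PER_QUARTER = 10_000


def quarterly_operations_saving(baseline_policies: int,
                                uplift_pct: int,
                                saving_per_policy_hkd: int) -> int:
    """Layer 1 saving per quarter, assuming the per-policy saving
    applies to every policy written at the uplifted volume."""
    policies_after = baseline_policies * (100 + uplift_pct) // 100
    return policies_after * saving_per_policy_hkd


turnaround_change = relative_change(TURNAROUND_BEFORE_DAYS, TURNAROUND_AFTER_DAYS)
quarterly_saving = quarterly_operations_saving(
    BASELINE_POLICIES_PER_QUARTER, VOLUME_UPLIFT_PCT, SAVING_PER_POLICY_HKD
)

if __name__ == "__main__":
    print(f"Layer 2: turnaround change {turnaround_change:.1%}")
    print(f"Layer 1: operations saving HK${quarterly_saving:,} per quarter")
```

Under these assumptions the Layer 2 change is a reduction of roughly 36% in turnaround time, and the Layer 1 figure scales linearly with whatever policy volume the business actually writes.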

Which KPIs Should You Track at Each Layer?

Each layer demands a small, defended set of KPIs that the next layer up can use as evidence. Layer 5 tracks infrastructure uptime and inference latency. Layer 4 tracks model accuracy, hallucination rate, and prompt cost. Layer 3 tracks weekly active users, depth of use, and abandonment. Layer 2 tracks process throughput, error rate, and cycle time. Layer 1 tracks cost reduction, revenue uplift, and working capital improvement.

The discipline is restraint. Most enterprise AI dashboards collapse under their own weight, tracking 40 metrics nobody can act on. A defensible framework picks two or three metrics per layer and protects them. Gartner's 2026 guidance argues that organisations measuring AI through five tightly scoped business-outcome metrics are 2.4 times more likely to receive renewed board funding than those tracking 20 or more activity metrics.

The hard part is what is excluded. Anything not on the list is research-grade information for the technical team, not management reporting. Boards do not want dashboards. Boards want a number, a trend, and a defended explanation.
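The restraint rule can be enforced mechanically. The sketch below lists the KPIs named above by layer and checks that no layer carries more than three. The registry format, the function name, and the limit parameter are illustrative choices, not part of any published framework.

```python
from typing import Dict, List

# KPIs as listed per layer in the article (Layer 5 is the base, Layer 1 the top).
KPI_REGISTRY: Dict[int, List[str]] = {
    5: ["infrastructure uptime", "inference latency"],
    4: ["model accuracy", "hallucination rate", "prompt cost"],
    3: ["weekly active users", "depth of use", "abandonment"],
    2: ["process throughput", "error rate", "cycle time"],
    1: ["cost reduction", "revenue uplift", "working capital improvement"],
}


def validate_kpi_registry(registry: Dict[int, List[str]],
                          max_per_layer: int = 3) -> List[str]:
    """Return a list of problems; an empty list means the registry is disciplined."""
    problems = []
    for layer in range(1, 6):
        names = registry.get(layer, [])
        if not names:
            problems.append(f"Layer {layer}: no KPI defined")
        elif len(names) > max_per_layer:
            problems.append(
                f"Layer {layer}: {len(names)} KPIs tracked, limit is {max_per_layer}"
            )
    return problems


if __name__ == "__main__":
    issues = validate_kpi_registry(KPI_REGISTRY)
    print("Registry OK" if not issues else "\n".join(issues))
```

Running a check like this before each steering committee cycle makes any metric added since the last review visible, so its inclusion has to be argued for rather than assumed.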

How Do You Handle the Cost Side of AI ROI?

Most AI ROI calculations fail because the cost side is dramatically understated. Gartner's 2026 AI value research found that 85% of organisations misestimate AI project costs by more than 10%, and the actual cost of a deployed AI system is typically two to three times the initial licensing or development estimate once data preparation, integration, change management, ongoing maintenance, and internal oversight are included.

A credible AI business case carries five cost categories. Direct technology costs include model API fees, infrastructure, and tooling. Data costs include sourcing, labelling, cleaning, and retention. Integration costs include engineering work to connect the AI system to ERP, CRM, or core systems. Change management costs include training, communications, and the time existing staff lose during rollout. Governance costs include policy work, monitoring, security review, and audit.

Hong Kong enterprises consistently underweight the last three. The Hong Kong Productivity Council noted in its 2026 AI adoption survey that 71% of mid-market firms in Hong Kong listed integration with legacy systems as the top barrier to scaling AI, which translates directly into integration costs that the original business case did not budget for.
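A fully-loaded cost view across the five categories can be sketched as follows. Every figure in the example is hypothetical and chosen only to show the mechanics. The sketch also assumes that the "initial estimate" covered direct technology costs alone.

```python
from typing import Dict, List

COST_CATEGORIES = (
    "direct_technology",
    "data",
    "integration",
    "change_management",
    "governance",
)


def three_year_tco(annual_costs_hkd: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum each category over the years supplied and add a grand total.
    Raises ValueError if any of the five categories is missing."""
    missing = [c for c in COST_CATEGORIES if c not in annual_costs_hkd]
    if missing:
        raise ValueError(f"Business case omits cost categories: {missing}")
    totals = {c: sum(annual_costs_hkd[c]) for c in COST_CATEGORIES}
    totals["total"] = sum(totals.values())
    return totals


def multiple_of_initial_estimate(totals: Dict[str, int],
                                 initial_estimate_hkd: int) -> float:
    """How many times larger the fully-loaded cost is than the initial estimate."""
    return totals["total"] / initial_estimate_hkd


# Hypothetical annual costs in HKD for years 1 to 3.
EXAMPLE_COSTS = {
    "direct_technology": [1_200_000, 1_000_000, 1_000_000],
    "data": [600_000, 200_000, 200_000],
    "integration": [1_500_000, 300_000, 200_000],
    "change_management": [500_000, 200_000, 100_000],
    "governance": [400_000, 300_000, 300_000],
}

if __name__ == "__main__":
    totals = three_year_tco(EXAMPLE_COSTS)
    for name, value in totals.items():
        print(f"{name:>18}: HK${value:,}")
    ratio = multiple_of_initial_estimate(totals, totals["direct_technology"])
    print(f"Fully-loaded cost is {ratio:.1f}x the direct technology estimate")
```

Rejecting a business case that omits a category outright, rather than defaulting the missing cost to zero, is a deliberate choice: it forces the integration, change management, and governance lines to be estimated rather than silently skipped.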

What Should a Real AI Business Case Look Like in 2026?

A defensible AI business case in 2026 covers six elements in under fifteen pages. It states the business problem in P&L terms. It quantifies the baseline cost or revenue line being targeted. It defines a three-year fully-loaded total cost of ownership across all five cost categories. It commits to specific Layer 1 metrics with target ranges, not point estimates. It identifies the workflow redesign required for the impact to land. It names a single accountable owner.

The point estimate trap is worth flagging. Any business case that promises "21.4% reduction in operating cost" without a range is read by the CFO as either naive or padded. Mature business cases give a target band, for example "12% to 18% reduction in operating cost, with the midpoint reached by month nine if adoption exceeds 60% of eligible users." That formulation tells the board what to track and when to step in if the trajectory drifts.
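The target-band formulation lends itself to a simple monthly check. The sketch below uses the band quoted above (12% to 18%, midpoint by month nine, adoption above 60%). The linear ramp toward the midpoint and the 80% tolerance are assumptions introduced here for illustration. A real programme would agree its own trajectory with the steering committee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetBand:
    low_pct: float             # lower bound of the committed reduction
    high_pct: float            # upper bound of the committed reduction
    midpoint_month: int        # month by which the midpoint should be reached
    adoption_threshold: float  # adoption share that must be exceeded

    @property
    def midpoint_pct(self) -> float:
        return (self.low_pct + self.high_pct) / 2


def expected_reduction_pct(band: TargetBand, month: int) -> float:
    """Assumed linear ramp to the midpoint; flat once the midpoint month is reached."""
    return band.midpoint_pct * min(month, band.midpoint_month) / band.midpoint_month


def assess(band: TargetBand, month: int, observed_reduction_pct: float,
           adoption_rate: float, tolerance: float = 0.8) -> str:
    """Classify the programme's status for a monthly report."""
    if adoption_rate <= band.adoption_threshold:
        return "adoption below threshold"
    if observed_reduction_pct < tolerance * expected_reduction_pct(band, month):
        return "behind trajectory"
    return "on track"


BAND = TargetBand(low_pct=12.0, high_pct=18.0, midpoint_month=9, adoption_threshold=0.60)

if __name__ == "__main__":
    print(assess(BAND, month=6, observed_reduction_pct=11.0, adoption_rate=0.65))
    print(assess(BAND, month=6, observed_reduction_pct=7.0, adoption_rate=0.65))
    print(assess(BAND, month=6, observed_reduction_pct=11.0, adoption_rate=0.50))
```

Checking adoption first reflects the conditional in the business case: if adoption has not cleared the threshold, the cost figure is not yet expected to be on trajectory, and the board's attention belongs on rollout.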

The workflow redesign element is the one most often missed. McKinsey's data is unambiguous. Organisations that paste AI onto existing workflows see negligible EBIT impact. Organisations that redesign the workflow capture the value. Any business case that does not name the workflow being rebuilt is, by McKinsey's evidence, statistically unlikely to deliver the promised return.

What Are the Most Common Pitfalls in AI ROI Measurement?

The recurring pitfalls fall into four categories. The first is the time-saved fallacy, where hours saved per employee are multiplied by hourly cost without checking whether the saved time was actually reallocated to value-generating work. The second is the pilot-to-scale gap, where ROI calculated on a 40-user pilot is naively projected onto a 4,000-user rollout. The third is single-vendor lock-in, whose costs surface only in year two. The fourth is the absence of a baseline measurement taken before the AI system went live.

The baseline problem is the most damaging. If the organisation never measured average cycle time, error rate, or cost per transaction before the AI deployment, every post-deployment number is a claim without a comparison. Boards have learned to discount any AI impact figure that lacks a documented pre-deployment baseline.
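The time-saved fallacy described above is easy to show numerically. In the sketch below, the reallocation rate (the share of saved hours that demonstrably moved to value-generating work) is a hypothetical parameter that would have to be measured, not assumed. The example figures are invented for illustration.

```python
def naive_time_saved_value(hours_saved_per_employee: float,
                           hourly_cost_hkd: float,
                           employees: int) -> float:
    """The fallacy: every saved hour is counted as realised value."""
    return hours_saved_per_employee * hourly_cost_hkd * employees


def realised_time_saved_value(hours_saved_per_employee: float,
                              hourly_cost_hkd: float,
                              employees: int,
                              reallocation_rate: float) -> float:
    """Count only the share of saved hours that was reallocated to
    value-generating work (reallocation_rate between 0 and 1)."""
    if not 0.0 <= reallocation_rate <= 1.0:
        raise ValueError("reallocation_rate must be between 0 and 1")
    naive = naive_time_saved_value(hours_saved_per_employee, hourly_cost_hkd, employees)
    return naive * reallocation_rate


if __name__ == "__main__":
    # Hypothetical pilot: 40 users, 10 hours saved each per month, HK$300 per hour.
    naive = naive_time_saved_value(10, 300, 40)
    realised = realised_time_saved_value(10, 300, 40, reallocation_rate=0.4)
    print(f"Naive monthly value:    HK${naive:,.0f}")
    print(f"Realised monthly value: HK${realised:,.0f}")
```

The gap between the two numbers is the part of the claimed benefit that cannot be defended to a CFO until the reallocation is evidenced.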

For Hong Kong enterprises specifically, an additional pitfall is currency and exchange exposure on AI infrastructure billed in US dollars. A six-quarter AI programme priced in USD can see its HK-dollar cost line swing by 4% to 6% on currency movement alone, which often eats the projected margin improvement if the business case did not hedge or stress-test the assumption.
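A stress test of the currency assumption can be sketched as below. The shock sizes of 4% and 6% are the swings quoted above. The USD cost, the base exchange rate, and the projected saving are hypothetical inputs chosen only to illustrate the calculation.

```python
from typing import Dict, Iterable


def stress_test_fx(usd_cost: float,
                   base_rate_hkd_per_usd: float,
                   projected_saving_hkd: float,
                   shocks: Iterable[float]) -> Dict[float, float]:
    """For each fractional shock to the HKD cost of a USD bill, return the
    net benefit in HKD (projected saving minus shocked cost)."""
    results = {}
    for shock in shocks:
        hkd_cost = usd_cost * base_rate_hkd_per_usd * (1 + shock)
        results[shock] = projected_saving_hkd - hkd_cost
    return results


if __name__ == "__main__":
    # Hypothetical programme: USD 1,000,000 of infrastructure, HK$8,200,000 projected saving.
    outcome = stress_test_fx(
        usd_cost=1_000_000,
        base_rate_hkd_per_usd=7.8,   # illustrative base rate
        projected_saving_hkd=8_200_000,
        shocks=[0.0, 0.04, 0.06],
    )
    for shock, net in outcome.items():
        print(f"Cost shock {shock:+.0%}: net benefit HK${net:,.0f}")
```

In this illustration the net benefit is positive with no shock and at a 4% shock, and turns negative at 6%, which is the situation a stress-tested business case is meant to surface before approval.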

How Should Hong Kong Enterprises Apply This Framework?

Application starts with the smallest defensible scope. A Hong Kong enterprise should pick one workflow with a measurable Layer 1 line, document the baseline cost or cycle time for ninety days before any AI deployment, define the five-cost-category TCO with a sensitivity range, and commit to a target band with monthly reporting back to the steering committee. The first programme is the credibility builder for everything that follows.
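The ninety-day baseline requirement can be built into the measurement itself. The sketch below refuses to report an improvement unless enough pre-deployment observations exist. It assumes one observation per day and a metric where lower is better (such as cycle time or cost per transaction). The sample values reuse the turnaround figures from the underwriting example purely as an illustration.

```python
from statistics import mean
from typing import Sequence


def improvement_vs_baseline(baseline_samples: Sequence[float],
                            post_samples: Sequence[float],
                            min_baseline_days: int = 90) -> float:
    """Fractional reduction of the post-deployment mean relative to the
    documented baseline mean. Assumes one sample per day."""
    if len(baseline_samples) < min_baseline_days:
        raise ValueError(
            f"Only {len(baseline_samples)} baseline days documented; "
            f"{min_baseline_days} required before any impact is reported"
        )
    if not post_samples:
        raise ValueError("No post-deployment samples supplied")
    baseline_mean = mean(baseline_samples)
    post_mean = mean(post_samples)
    return (baseline_mean - post_mean) / baseline_mean


if __name__ == "__main__":
    baseline = [4.2] * 90   # illustrative daily average turnaround, in days
    post = [2.7] * 30
    print(f"Reduction vs baseline: {improvement_vs_baseline(baseline, post):.1%}")
```

Raising an error on a thin baseline, instead of returning a number, mirrors how a board treats such figures: a claim without a comparison is not reported at all.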

The choice of first workflow matters. The strongest candidates are workflows with high volume, structured inputs, and an existing measurable error or cost line. Document processing in claims, KYC, and accounts payable typically clears that bar. Customer-service deflection clears it if the call-volume and cost-per-contact baseline already exist. Sales prospecting often does not clear it because the baseline is too noisy to defend.

UD understands both the cold edges of AI and the hard parts of your work. For twenty-eight years we have worked alongside Hong Kong enterprises, treating technology as a partnership built on warmth. The measurement framework is how that partnership stays accountable to the numbers your board cares about.

Now that you have the framework, the next step is identifying the right entry point for your organisation. We'll walk you through every step, from AI readiness assessment to workflow selection, deployment, and quarterly ROI reporting that holds up to board scrutiny.
