Treating enterprise AI as an operating layer
There’s a fault line running through enterprise AI, and it’s not the one getting the most attention. The public conversation still tracks foundation models and benchmarks — GPT versus Gemini, reasoning scores, and marginal capability gains. But in practice, the more durable advantage is structural: who owns the operating layer where intelligence is applied, governed, and improved. One model treats AI as an on-demand utility; the other embeds it as an operating layer: the combination of workflow software, data capture, feedback loops, and governance that sits between models and real work, and that compounds with use.

Model providers like OpenAI and Anthropic sell intelligence as a service: you have a problem, you call an API, you get an answer. That intelligence is general-purpose, largely stateless, and only loosely connected to the day-to-day workflow where decisions are made. It’s highly capable and increasingly interchangeable. The distinction that matters is whether intelligence resets on every prompt or accumulates over time.
Incumbent organizations, by contrast, can treat AI as an operating layer: instrumentation across workflows, feedback loops from human decisions, and governance that turns individual tasks into reusable policy. In that setup, every exception, correction, and approval becomes a chance to learn—and intelligence can improve as the platform absorbs more of the organization’s work. The organizations most likely to shape the enterprise AI era are those that can embed intelligence directly into operational platforms and instrument those platforms so work generates usable signals.
The prevailing narrative says nimble startups will out-innovate incumbents by building AI-native from scratch. If AI is primarily a model problem, that story holds. But in many enterprise domains, AI is a systems problem — integrations, permissions, evaluation, and change management — where advantage accrues to whoever already sits inside high-volume, high-stakes workflows and converts that position into learning and automation.
The inversion: AI executes, humans adjudicate
Traditional services organizations are built on a simple architecture: humans use software to do expert work. Operators log into systems, navigate workflows, make decisions, and process cases. Technology is the medium. Human judgment is the product.
An AI-native platform inverts this. It ingests a problem, applies accumulated domain knowledge, executes autonomously what it can with high confidence, and routes targeted sub-tasks to human experts when the situation demands judgment that the system can’t yet reliably provide.
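To make this routing concrete, the sketch below shows one way confidence-gated execution could look in Python. The Task and ModelResult types, the 0.9 threshold, and the route function are illustrative assumptions, not a description of any specific platform’s implementation.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.9  # illustrative cutoff; a real system would tune this per task type


@dataclass
class Task:
    task_id: str
    payload: dict  # whatever case data the workflow carries


@dataclass
class ModelResult:
    action: str        # the action the model proposes
    confidence: float  # the model's calibrated estimate that the action is correct


def route(task: Task, result: ModelResult):
    """Execute autonomously when confidence is high; otherwise escalate a targeted sub-task."""
    if result.confidence >= CONFIDENCE_THRESHOLD:
        return ("auto_execute", result.action)
    # Below threshold: only the ambiguous piece goes to a human expert queue
    return ("human_review", task)
```

The interesting design choice is not the threshold itself but how the confidence estimate is calibrated against historical outcomes, which is where the accumulated operational data discussed below comes in.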
But inverting human-AI interaction isn’t just a UI redesign — it requires raw material. It’s only possible when the platform is built on a foundation of domain expertise, behavioral data, and operational knowledge accumulated over years.
The three compounding assets incumbents already own
AI-native startups begin with a clean architectural slate and can move quickly. What they can’t easily manufacture is the raw material that makes domain AI defensible at scale:
- Proprietary operational data
- A large workforce of domain experts whose day-to-day decisions generate training signals
- Accumulated tacit knowledge about how complex work actually gets done
Services companies already have all three. But these ingredients aren’t moats on their own. They become an advantage only when a company can systematically convert messy operations into AI-ready signals and institutional knowledge — then feed the results back into the workflow so the system keeps improving.
Codifying expertise into reusable signals
In most services organizations, expertise is tacit and perishable. The best operators know things they cannot easily articulate: heuristics developed over years of practice, edge-case intuitions, and pattern recognition, all operating below the level of conscious reasoning.
At Ensemble, the strategy for addressing this challenge is knowledge distillation: the systematic conversion of expert judgment and operational decisions into machine-readable training signals.
In health-care revenue cycle management, for example, systems can be seeded with explicit domain knowledge and then deepen their coverage through structured daily interaction with operators. In Ensemble’s implementation, the system identifies gaps, formulates targeted questions, and cross-checks answers across multiple experts to capture both consensus and edge-case nuance. It then synthesizes these inputs into a living knowledge base that reflects the situational reasoning behind expert-level performance.
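One way to picture the cross-checking step is a small record that gathers several experts’ answers to the same targeted question and separates consensus from edge-case dissent. The ExpertAnswer and KnowledgeEntry types, the quorum value, and the cross_check function below are hypothetical illustrations of that idea, not Ensemble’s actual implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ExpertAnswer:
    expert_id: str
    answer: str          # normalized answer, e.g. a coding or appeal decision
    rationale: str = ""  # optional short explanation from the expert


@dataclass
class KnowledgeEntry:
    question: str
    consensus: str | None = None                                  # majority answer, if a quorum exists
    edge_cases: list[ExpertAnswer] = field(default_factory=list)  # dissenting or nuanced answers


def cross_check(question: str, answers: list[ExpertAnswer], quorum: float = 0.7) -> KnowledgeEntry:
    """Label the majority answer as consensus and keep dissents as edge-case material."""
    entry = KnowledgeEntry(question=question)
    if not answers:
        return entry
    counts = Counter(a.answer for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    if top_count / len(answers) >= quorum:
        entry.consensus = top_answer
        entry.edge_cases = [a for a in answers if a.answer != top_answer]
    else:
        # No quorum: everything becomes follow-up material for more targeted questions
        entry.edge_cases = list(answers)
    return entry
```

In practice the hard parts are formulating the questions and normalizing the answers, not the tallying, but the shape of the output (a consensus answer plus documented exceptions) is what feeds the living knowledge base.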
Turning decisions into a learning flywheel
Once a system is constrained enough to be trusted, the next question is how it gets better without waiting for annual model upgrades. Every time a skilled operator makes a decision, they generate more than a completed task. They generate a potential labeled example—context paired with an expert action (and sometimes an outcome). At scale, across thousands of operators and millions of decisions, that stream can power supervised learning, evaluation, and targeted forms of reinforcement—teaching systems to behave more like experts in real conditions.
For example, if an organization processes 50,000 cases a week and captures just three high-quality decision points per case, that’s 150,000 labeled examples every week without creating a separate data-collection program.
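As a rough sketch of what one captured decision point might look like, along with the volume arithmetic above, consider the following. The DecisionSignal shape and its field names are assumptions made for illustration rather than a real schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DecisionSignal:
    case_id: str
    context: dict                      # what the operator could see at decision time
    expert_action: str                 # what the operator actually did
    outcome: str | None = None         # filled in later, once the result is known
    captured_at: datetime | None = None


# Back-of-the-envelope volume from the example above
cases_per_week = 50_000
signals_per_case = 3
print(cases_per_week * signals_per_case)  # 150000 labeled examples per week
```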
A more advanced human-in-the-loop design places experts inside the decision process, so systems learn not just what the right answer was, but how ambiguity gets resolved. Practically, humans intervene at branch points—selecting from AI-generated options, correcting assumptions, and redirecting the workflow. Each intervention becomes a high-value training signal. When the platform detects an edge case or a deviation from the expected process, it can prompt for a brief, structured rationale, capturing decision factors without requiring lengthy free-form reasoning logs.
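A minimal sketch of how an intervention at a branch point could be turned into a training signal is shown below. The BranchPoint and Intervention types and the record_intervention helper are illustrative assumptions, not a documented API.

```python
from dataclasses import dataclass


@dataclass
class BranchPoint:
    case_id: str
    ai_options: list[str]     # candidate next actions the system proposes
    ai_recommendation: str    # the option the system would take on its own


@dataclass
class Intervention:
    branch: BranchPoint
    chosen_action: str
    was_override: bool            # did the expert depart from the recommendation?
    decision_factors: list[str]   # short, structured rationale tags, not free-form prose


def record_intervention(branch: BranchPoint, chosen_action: str,
                        decision_factors: list[str]) -> Intervention:
    """Capture an expert's choice at a branch point, flagging overrides for extra weight."""
    return Intervention(
        branch=branch,
        chosen_action=chosen_action,
        was_override=(chosen_action != branch.ai_recommendation),
        decision_factors=decision_factors,
    )
```

Flagging overrides separately matters because they mark exactly the cases where the system’s current behavior and expert judgment diverge, which is where targeted retraining and evaluation effort pay off most.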
Building toward expertise amplification
The goal is to permanently embed the accumulated expertise of thousands of domain experts—their knowledge, decisions, and reasoning—into an AI platform that amplifies what every operator can accomplish. Done well, this produces a quality of execution that neither humans nor AI can achieve independently: higher consistency, improved throughput, and measurable operational gains. Operators can focus on more consequential work, supported by an AI that has already completed the analytical groundwork across thousands of analogous prior cases.
The broader implication for enterprise leaders is straightforward. Advantage in AI won’t be determined by access to general-purpose models alone. It will come from an organization’s ability to capture, refine, and compound what it knows (its data, decisions, and operational judgment) while building the controls required for high-stakes environments. As AI shifts from experimentation to infrastructure, the most durable edge may belong to the companies that understand the work well enough to instrument it and can turn that understanding into systems that improve with use.
This content was produced by Ensemble. It was not written by MIT Technology Review’s editorial staff.