Why every platform team is rebuilding around LLMs

RAG pipelines, agent runtimes, and eval harnesses are moving from side projects to core infrastructure. Here's what the shift looks like in production.

Saurabh Khan · AI & DevOps Engineer · Jun 18, 2026 · 1 min read

For most of the last decade, "platform engineering" meant Kubernetes, CI/CD, and a paved road for shipping services. In 2026, a second paved road is being laid down next to it — one for LLM-powered features — and the teams who own it look a lot like the teams who owned containers five years ago.

The new primitives

A production LLM stack has settled into a recognizable shape:

Gateways that handle routing, retries, rate limits, and cost attribution across providers.
Retrieval pipelines (RAG) that keep a vector index fresh and observable.
Eval harnesses that score outputs continuously, not just at release.
Agent runtimes with tool sandboxes, timeouts, and audit logs.

None of these are research problems anymore. They're infrastructure, and infrastructure is what platform teams do.
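As a rough illustration of the gateway's retry and routing duties, here is a minimal sketch of ordered fallback across providers with exponential backoff. Everything in it is an assumption made for illustration: `ProviderError`, `call_with_fallback`, and the two stand-in provider functions are hypothetical and are not the API of any real gateway.

```python
from __future__ import annotations

import time
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error a provider callable raises when a request fails."""


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
    max_retries: int = 2,
    base_delay_s: float = 0.5,
) -> tuple[str, str]:
    """Try providers in order; retry each with exponential backoff.

    Returns (provider_name, response_text).
    Raises ProviderError if every provider fails on every attempt.
    """
    last_error: Optional[Exception] = None
    for name, call in providers:
        for attempt in range(max_retries + 1):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                last_error = exc
                # Wait before retrying the same provider; skip the wait
                # after the final attempt and move on to the next provider.
                if attempt < max_retries:
                    time.sleep(base_delay_s * (2 ** attempt))
    raise ProviderError(f"all providers failed: {last_error}")


# Stand-in providers for the demo (hypothetical, no network calls).
attempts = []


def flaky_primary(prompt: str) -> str:
    attempts.append(prompt)
    raise ProviderError("simulated timeout")


def stable_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


provider_name, reply = call_with_fallback(
    [("primary", flaky_primary), ("secondary", stable_secondary)],
    "Summarize this ticket.",
    base_delay_s=0.0,  # no waiting in this demo
)
print(provider_name, reply)
```

A real gateway would also need rate limiting and per-request timeouts, which this sketch leaves out to keep the retry and fallback logic visible.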

Treat prompts like config

The biggest cultural shift is treating prompts and model choices as versioned configuration, not code buried in a service:

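from observe import gateway

# Example input; in a real service this would come from the incoming request.
ticket_text = "Customer cannot log in after a password reset."

resp = gateway.chat(
    model="claude-opus-4-8",
    prompt_id="support/triage@v7",   # versioned, not inline
    inputs={"ticket": ticket_text},
    budget_usd=0.02,                 # hard ceiling, attributed to the team
)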

Once prompts are versioned artifacts, you can roll them back, A/B them, and diff them — the same things you already do for everything else in production.
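To make rollback and diffing concrete, here is a minimal in-memory sketch of a prompt registry. It assumes prompt IDs of the form `name@vN`, matching the `support/triage@v7` style used earlier. The `PromptRegistry` class and its methods are hypothetical, and a real system would back this with a database or a Git repository.

```python
from __future__ import annotations

import difflib
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """Hypothetical in-memory store of prompt versions, keyed by name."""

    _versions: dict[str, list[str]] = field(default_factory=dict)
    _active: dict[str, int] = field(default_factory=dict)

    def publish(self, name: str, text: str) -> str:
        """Store a new version, make it active, and return its ID."""
        versions = self._versions.setdefault(name, [])
        versions.append(text)
        self._active[name] = len(versions)
        return f"{name}@v{len(versions)}"

    def get(self, prompt_id: str) -> str:
        """Resolve 'name@vN' to text; a bare 'name' uses the active version."""
        name, _, version = prompt_id.partition("@v")
        index = int(version) if version else self._active[name]
        return self._versions[name][index - 1]

    def rollback(self, name: str) -> str:
        """Point the active version one step back and return its ID."""
        if self._active[name] <= 1:
            raise ValueError(f"{name} has no earlier version")
        self._active[name] -= 1
        return f"{name}@v{self._active[name]}"

    def diff(self, name: str, old: int, new: int) -> list[str]:
        """Return a unified diff between two versions of one prompt."""
        return list(
            difflib.unified_diff(
                self._versions[name][old - 1].splitlines(),
                self._versions[name][new - 1].splitlines(),
                fromfile=f"{name}@v{old}",
                tofile=f"{name}@v{new}",
                lineterm="",
            )
        )


registry = PromptRegistry()
first_id = registry.publish(
    "support/triage", "Classify the ticket.\nReply with one label."
)
second_id = registry.publish(
    "support/triage", "Classify the ticket.\nReply with one label and a reason."
)
changes = registry.diff("support/triage", 1, 2)
print("\n".join(changes))

rolled_back_id = registry.rollback("support/triage")
active_text = registry.get("support/triage")
```

Keeping old versions immutable and moving only the active pointer is what makes rollback cheap: nothing is rewritten, so any earlier version stays addressable by its ID.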

What to build first

If you're standing this up, start with the gateway and cost attribution. You can't manage what you can't see, and the first painful surprise is usually the bill. Retrieval and evals come next, once you have traffic flowing through one observable choke point.
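A minimal sketch of per-team cost attribution with a hard per-call ceiling might look like the following. The `CostLedger` class, the model name `model-a`, and the price are placeholders invented for illustration; they are not real pricing or a real library.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Hypothetical error raised when a call would exceed its ceiling."""


@dataclass
class CostLedger:
    # USD per 1,000 tokens by model; placeholder numbers, not real pricing.
    price_per_1k_tokens: dict[str, float]
    spend_by_team: defaultdict = field(
        default_factory=lambda: defaultdict(float)
    )

    def estimate(self, model: str, tokens: int) -> float:
        """Estimate the cost of a call from its token count."""
        return self.price_per_1k_tokens[model] * tokens / 1000

    def record(
        self, team: str, model: str, tokens: int, budget_usd: float
    ) -> float:
        """Attribute a call's cost to a team, rejecting calls over budget."""
        cost = self.estimate(model, tokens)
        if cost > budget_usd:
            raise BudgetExceeded(
                f"{team}: estimated ${cost:.4f} exceeds ceiling ${budget_usd:.4f}"
            )
        self.spend_by_team[team] += cost
        return cost


ledger = CostLedger(price_per_1k_tokens={"model-a": 0.01})
ledger.record("support", "model-a", tokens=1500, budget_usd=0.02)
ledger.record("support", "model-a", tokens=500, budget_usd=0.02)

try:
    ledger.record("search", "model-a", tokens=3000, budget_usd=0.02)
    rejected = False
except BudgetExceeded:
    rejected = True  # the over-budget call is never attributed

print(dict(ledger.spend_by_team))
```

Checking the ceiling before the spend is recorded means a rejected call never shows up in a team's total, which keeps the ledger consistent with what was actually sent to a provider.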

The teams getting this right aren't the ones with the fanciest models. They're the ones who made LLM calls boring, observable, and cheap to change.

#LLMs #Platform Engineering #RAG #AIOps
