The Plumbing Premium
Modal Labs raised $87 million in September 2025 at a $1.1 billion valuation, on annualized revenue near $50 million. Baseten raised $150 million in its Series D at $2.15 billion. Fireworks AI raised $254 million at $4 billion.
These are infrastructure companies most application-layer founders haven’t heard of. They’re already worth more than the AI apps they serve.
Every compute paradigm produces the same investing mistake. The first wave of money chases applications. The second wave, the one that compounds, goes to infrastructure. AWS was plumbing for web apps. Stripe was plumbing for payments. The plumbing layer for AI workloads is being built right now, and it’s where the value will concentrate.
I’m calling this the Plumbing Premium: the systematic underpricing of infrastructure relative to applications in the early phase of any paradigm shift, followed by the systematic outperformance of infrastructure companies once the application layer commoditizes.
The physics are different
Web infrastructure follows Newtonian mechanics: linear, predictable, stackable. Double the users, double the servers. AI workloads follow thermodynamics. They’re non-linear, state-dependent, and entropy-prone.
You can’t “add more boxes” when the bottleneck is a 70-billion parameter model that takes 45 seconds to cold-start. Traditional auto-scaling doesn’t help when your cost driver is API calls to external AI services, not CPU utilization.
At Ostronaut, a single content generation request in production calls an LLM provider, a vector database, a text-to-speech service, a video renderer, and blob storage. Five external services, each with different latency profiles and failure modes. The cost surface is unrecognizable compared to a standard web app.
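A minimal sketch of what tracing one such request can look like. The service names and per-call costs below are hypothetical illustrations, not Ostronaut's actual stack or pricing — the point is that latency and cost attach to external spans, not to CPU time:

```python
import time
from dataclasses import dataclass, field


@dataclass
class RequestTrace:
    """Per-request trace across external AI services: latency and cost per span."""
    spans: list = field(default_factory=list)

    def record(self, service: str, fn, cost_usd: float):
        # Wrap one external call, timing it and attributing its cost.
        start = time.perf_counter()
        try:
            return fn()
        finally:
            self.spans.append({
                "service": service,
                "latency_s": time.perf_counter() - start,
                "cost_usd": cost_usd,
            })

    @property
    def total_cost(self) -> float:
        return sum(s["cost_usd"] for s in self.spans)


# Hypothetical per-call costs; a real pipeline would invoke live SDKs here.
trace = RequestTrace()
script = trace.record("llm", lambda: "draft script", 0.004)
audio = trace.record("tts", lambda: "audio bytes", 0.012)
print(f"{len(trace.spans)} spans, ${trace.total_cost:.3f} total")
```

Five spans with five different failure modes is why the cost surface looks nothing like a web app's: a single slow or failed span dominates the request, and none of it registers as CPU utilization.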
The tooling built for web infrastructure assumes the wrong physics. That gap is the market.
The unbundling has already happened
The AI infrastructure stack is splitting into three layers, each with a direct analogue in the mature cloud stack:
| AI Infrastructure Layer | Cloud Equivalent | Companies & Funding |
|---|---|---|
| Deployment (GPU scheduling, cold-start, inference routing) | Compute (EC2) | Modal ($111M total), Baseten ($285M+), Fireworks ($331M) |
| Orchestration (agent loops, retrieval chains, workflows) | Container orchestration (K8s) | LangChain, LlamaIndex, CrewAI (all open source) |
| Observability (prompt logging, cost tracking, hallucination detection) | Monitoring (Datadog) | Helicone, Langfuse, Arize ($70M Series C, Feb 2025) |
Modal’s entire bet: AI inference is bursty. You need massive compute for minutes, then nothing. Paying for reserved GPU instances at 15% utilization is like paying for a bus because you occasionally move 50 people. Serverless GPU aligns cost with usage.
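The economics are easy to sketch. The hourly rate and the 2.5x serverless markup below are illustrative assumptions, not Modal's actual pricing:

```python
HOURS_PER_MONTH = 730


def reserved_cost(hourly_rate: float) -> float:
    # Reserved instance: you pay for every hour, busy or idle.
    return hourly_rate * HOURS_PER_MONTH


def serverless_cost(hourly_rate: float, utilization: float,
                    premium: float = 2.5) -> float:
    # Serverless GPU: pay only for busy hours, at a per-hour markup.
    return hourly_rate * premium * HOURS_PER_MONTH * utilization


rate = 2.00  # $/GPU-hour, illustrative only
print(reserved_cost(rate))          # 1460.0 per month
print(serverless_cost(rate, 0.15))  # 547.5 per month at 15% utilization
```

Even paying a 2.5x per-hour markup, serverless costs roughly a third as much at 15% utilization. The crossover in this toy model comes at 40% utilization, which is why steady high-volume workloads still reserve capacity while bursty inference doesn't.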
Simple insight. Modal went from seed to unicorn in under two years on it.
The observability layer is equally underserved. Datadog can tell you your p99 latency. It can’t tell you your model started hallucinating at 3 AM because your vector index got corrupted. You need the full prompt, the completion, token counts, latency, cost per request, and the quality score of the output.
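A sketch of the per-request record such a tool has to capture — fields that web-era monitoring never sees. The field names, the quality-score source, and the degradation threshold are all assumptions for illustration, not any vendor's schema:

```python
from dataclasses import dataclass


@dataclass
class LLMCallRecord:
    # One record per model call: the AI-native dimensions of observability.
    prompt: str
    completion: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    cost_usd: float
    quality_score: float  # assumed to come from an eval model, 0.0-1.0

    def is_degraded(self, min_quality: float = 0.5) -> bool:
        # A corrupted vector index shows up here, not in a CPU or p99 dashboard.
        return self.quality_score < min_quality


rec = LLMCallRecord(
    prompt="Summarize the Q3 report",
    completion="The Q3 report covers...",
    prompt_tokens=812, completion_tokens=145,
    latency_ms=1840.0, cost_usd=0.0031,
    quality_score=0.22,  # low score flagged by the (assumed) evaluator
)
print(rec.is_degraded())  # True
```

Latency and cost look healthy on that record; only the quality score reveals the 3 AM failure. That dimension is the one Datadog-style tooling structurally lacks.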
Galileo raised $45 million in October 2024 specifically for AI evaluation intelligence. Revenue grew 834% that year. Six Fortune 50 companies signed up.
The Indian cost constraint advantage
Indian AI startups face an extra constraint that turns out to be an advantage. When your ARPU is a tenth of a US SaaS product’s, you can’t afford a sloppy infrastructure stack.
Sarvam AI is the clearest example. Building multilingual foundation models across all 22 official Indian languages, funded with $53.8 million plus ₹246 crore in government compute support under the IndiaAI Mission. Their Sarvam-105B model uses mixture-of-experts architecture — not because it’s elegant, but because serving 22 languages at Indian price points on standard dense models is economically impossible.
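The arithmetic behind that claim is simple. A common rule of thumb puts forward-pass inference at roughly 2 FLOPs per active parameter per token; the 18B active-parameter figure below is a hypothetical MoE configuration for illustration, not a published Sarvam number:

```python
def flops_per_token(active_params: float) -> float:
    # Rule of thumb: ~2 FLOPs per active parameter per forward-pass token.
    return 2.0 * active_params


dense_105b = flops_per_token(105e9)
# Hypothetical MoE configuration: 105B total parameters, ~18B active per token.
moe_18b_active = flops_per_token(18e9)
print(dense_105b / moe_18b_active)  # ≈ 5.83x fewer FLOPs per token
```

A model that activates a fraction of its weights per token serves each request at a fraction of the compute. At Indian price points, that ratio is the difference between a viable unit economy and none.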
Infrastructure efficiency isn’t a feature. It’s survival.
The Indian companies that get infrastructure right will have a structural cost advantage that well-funded but wasteful competitors can’t easily replicate. The same pattern that made Indian IT services globally competitive, applied to a new domain.
Tricog and Niramai both built AI diagnostic tools for healthcare. Both had to solve for deployment in tier-2 and tier-3 cities with unreliable connectivity and low per-scan economics. The infrastructure constraints forced architectural decisions that made the products more robust and cheaper to operate than comparable Western products. That’s not a story about scrappiness. It’s a story about economic physics shaping technical architecture.
What I got wrong
The Plumbing Premium thesis assumes the infrastructure layer consolidates before the application layer does. That’s how it played out with cloud. But AI might be different.
If OpenAI or Anthropic vertically integrate deployment, orchestration, and observability into their platforms, the standalone infrastructure companies get squeezed. The foundation model providers have historically been bad at building infrastructure products. But they have the distribution. OpenAI’s Assistants API is a direct attempt to own the orchestration layer. It hasn’t worked yet, but the threat is real.
The other open question: where does the Plumbing Premium accrue when the plumbing itself is open source? LangChain and LlamaIndex are both open source. The deployment layer is where the money is concentrating — Modal, Baseten, Fireworks have raised over $700 million combined.
If orchestration commoditizes, the premium shifts to deployment and observability. Or it shifts to managed services built on open source, which is how Red Hat worked. But the dynamics are still being sorted.
| Scenario | What happens | Premium accrues to | Current signal |
|---|---|---|---|
| Vertical integration | OpenAI/Anthropic own deployment + orchestration | Model providers | OpenAI Assistants API attempting orchestration layer |
| Horizontal layering | Deployment consolidates, orchestration goes OSS | Deployment + observability | $700M+ into deployment, $115M+ into observability |
| Managed OSS | OSS orchestration wins, managed services layer emerges | Managed service providers (Red Hat model) | LangChain/LlamaIndex both open source |
Model providers will commoditize. Applications will churn. The question is whether infrastructure compounds in AI the way it compounded in cloud, or whether this paradigm has different economics.
The funding data says the market is betting on compounding. I think the market is right. But I’m watching the vertical integration moves closely.