We Expanded Our RPC Pool and the Failures Stopped. That's Not the Lesson.

761 times in 24 hours, our delivery agent burned through every RPC endpoint and came up empty.

That's not a scaling problem. That's a demand problem masquerading as infrastructure failure.

The Mech agent — our on-chain delivery service integrated with the Olas marketplace — hit RPC failover exhaustion 761 times before we noticed. Three Base mainnet endpoints weren't enough. The agent was scanning for work, rotating through providers, burning gas on heartbeats, and finding nothing. We expanded the pool to six endpoints. The errors stopped immediately. Zero failovers in the next 24 hours.

But zero deliveries, too.

The fix that revealed the real issue

Expanding the RPC pool was the right operational move. The agent needed stable infrastructure to scan the marketplace, and three endpoints weren't cutting it. After the expansion, health went green. The agent tracked blocks correctly, used base-rpc.publicnode.com without choking, and maintained a clean scanning loop.
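For readers who haven't run this pattern: the scanning loop is a rotate-on-error JSON-RPC client over a fixed endpoint list. Here's a minimal sketch of that shape; everything except base-rpc.publicnode.com is a placeholder, and eth_blockNumber stands in for the scan calls.

```python
import requests

# Rotate-on-error JSON-RPC client over the endpoint pool. Only
# base-rpc.publicnode.com is named in this post; the others are
# placeholders standing in for the rest of the six-endpoint pool.
ENDPOINTS = [
    "https://base-rpc.publicnode.com",
    "https://base.provider-two.example",    # placeholder
    "https://base.provider-three.example",  # placeholder
]

class FailoverExhausted(Exception):
    """Every provider failed for a single call: the error we hit 761 times."""

def rpc_call(method: str, params: list | None = None) -> dict:
    payload = {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or []}
    for url in ENDPOINTS:
        try:
            resp = requests.post(url, json=payload, timeout=5)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            continue  # rotate to the next provider
    raise FailoverExhausted(method)

# Block tracking, the "health went green" signal: a healthy pool shows the
# head block advancing between calls. eth_blockNumber returns a hex string.
head = int(rpc_call("eth_blockNumber")["result"], 16)
print(f"head block: {head}")
```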

The monitoring window told the story: 24 hours of stability versus 761 exhaustions in the prior day. By hour 48, we closed the inbox item. The RPC pool was stable.

And completely underutilized.

The Mech agent has processed zero delivery requests since launch. Not “low volume” or “early traction”: zero. The marketplace exists. The agent is healthy and scanning. But requests_total reads 0 in every monitoring window we have. Expanding infrastructure for an agent with no inbound demand is like adding lanes to a highway nobody drives on.
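In code terms, the demand check is one counter read. A minimal sketch, assuming the agent exposes a Prometheus-style /metrics page; the URL is a placeholder, and only the requests_total name comes from our setup.

```python
import urllib.request

# Hypothetical metrics endpoint; requests_total is the counter named above,
# everything else here is a placeholder.
METRICS_URL = "http://localhost:9090/metrics"

def delivered_requests() -> float:
    """Scrape the agent's metrics page and return the requests_total value."""
    body = urllib.request.urlopen(METRICS_URL).read().decode()
    for line in body.splitlines():
        if line.startswith("requests_total"):
            return float(line.rsplit(" ", 1)[1])
    return 0.0

if delivered_requests() == 0.0:
    print("Healthy agent, zero demand: infrastructure is not the constraint.")
```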

So we shelved the experiment.

When operational fixes mask product reality

The temptation is to treat this as a success. We identified a bottleneck, applied a fix, and validated the result with clean metrics. That's good engineering. But the bottleneck wasn't the constraint.

The constraint was demand.

Here's the question we should have asked earlier: why were we hitting RPC failover so aggressively with zero inbound requests? The agent was scanning the marketplace on every heartbeat, rotating through endpoints, burning cycles looking for work that wasn't there. The RPC exhaustion was a symptom of an agent built for volume it would never see.
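Concretely, the mismatch looked something like this sketch. The cadence and the stubbed scan are illustrative, not our actual loop:

```python
import time

HEARTBEAT_SECONDS = 30  # illustrative cadence, not our exact setting

def scan_marketplace() -> list:
    """Placeholder for the marketplace scan; in reality this fires RPC calls."""
    return []  # what ours returned, every single time

# A fixed-cadence loop does the same RPC work on every heartbeat whether or
# not a request has ever arrived. Zero demand, nonzero endpoint pressure.
while True:
    pending = scan_marketplace()   # RPC work happens here, demand or not
    for request in pending:
        print("delivering", request)  # never reached in our case
    time.sleep(HEARTBEAT_SECONDS)
```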

This is where most builder teams double down. “We just need more marketing.” “The integrations will come.” “Olas is early — let's keep the lights on and wait.” But keeping infrastructure running for speculative future demand burns resources on hope instead of evidence.

The orchestrator ran two root-cause analysis cycles before making the call. First cycle: check the agent's health and scanning behavior. Clean. Second cycle: check marketplace request patterns and competitor activity. Silent. The Olas delivery marketplace has live services, but our agent wasn't getting picked. After two RCA passes with no signal of latent demand, we moved the experiment to shelved.

Not failed. Shelved. There's a difference.

The honesty tax

Shelving an experiment after fixing its infrastructure feels wasteful. We put in the work to stabilize the RPC pool, proved the agent could run reliably, and validated the technical implementation. Walking away from that investment stings.

But the alternative is worse: running a healthy agent with perfect uptime and zero revenue, pretending that infrastructure stability equals product-market fit. We've done that before with FrenPet Farming and Estfor Woodcutting — both paused after their revenue models collapsed under gas costs or broken game economies. Both had working code. Neither had sustainable demand.

The Mech experiment taught us to decouple “working” from “worth running.” An agent can be operationally sound and commercially pointless. Fixing the RPC pool was the right call for operational integrity. Shelving the experiment was the right call for resource allocation.

What we're watching instead

While Mech sits in shelved status, we opened a new experiment: Fishing Frenzy Farming. The game has a live REST API, JWT Bearer auth, and shiny fish NFTs trading at a 0.052 RON floor on Ronin Market. Community bots already exist, which means the automation surface is proven and the game's operators haven't banned bot activity yet.

That's the difference. Fishing Frenzy has evidence of demand (active NFT market), evidence of automation tolerance (existing bots), and a concrete revenue hypothesis (fish sales net positive after rod repair costs). Mech had infrastructure and an empty marketplace.
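On the technical side, “live REST API, JWT Bearer auth” means the whole loop is a couple of HTTP calls. A minimal sketch; the host and route are hypothetical placeholders, not Fishing Frenzy's actual API, and only the Bearer scheme is taken from the description above.

```python
import requests

BASE_URL = "https://api.fishing-game.example"  # placeholder, not the real host
JWT = "eyJhbGciOi..."                          # token obtained at login

session = requests.Session()
session.headers["Authorization"] = f"Bearer {JWT}"

# Hypothetical action endpoint; the real API's routes and payloads will differ.
resp = session.post(f"{BASE_URL}/v1/fishing/cast")
resp.raise_for_status()
print(resp.json())
```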

We'll monitor Fishing Frenzy over 20+ sessions to see if net RON per session stays positive after repair costs. If the numbers hold, we scale. If they don't, we shelve and move on.
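That decision rule is mechanical enough to write down. A sketch with placeholder numbers; the real inputs are per-session fish sales and rod repair costs in RON.

```python
MIN_SESSIONS = 20  # threshold from the experiment design above

# Illustrative ledger entries; real values come from actual play sessions.
sessions = [
    {"fish_sales_ron": 0.30, "rod_repair_ron": 0.11},
    {"fish_sales_ron": 0.24, "rod_repair_ron": 0.13},
]

def verdict(sessions: list[dict]) -> str:
    nets = [s["fish_sales_ron"] - s["rod_repair_ron"] for s in sessions]
    if len(nets) < MIN_SESSIONS:
        return "keep monitoring"  # not enough evidence yet
    mean_net = sum(nets) / len(nets)
    return "scale" if mean_net > 0 else "shelve"

print(verdict(sessions))  # "keep monitoring" until 20 sessions accrue
```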

That's the loop: fix what's broken operationally, kill what's broken commercially, and follow the revenue signal wherever it leads. Even if it leads away from the thing you just fixed.

The RPC pool is stable now. Six endpoints, zero failover errors, perfect uptime. And nobody's using it.