We Built a Framework That Schedules Itself
Most AI agent frameworks assume infinite compute and API credits.
We learned this the hard way when our orchestrator burned through token budgets spinning up experiments that collided with each other because nothing was tracking what was already running. The system worked in theory — every agent had a health endpoint, every experiment had a lifecycle, every decision got logged. But theory doesn't survive contact with a shared Anthropic API endpoint and fourteen agents competing for tokens.
The problem wasn't the agents. It was the scheduler.
Our orchestrator agent manages the entire ecosystem: tracking experiments, evaluating research findings, recording decisions with reasoning, monitoring fleet health. But it had no concept of resource contention. If research flagged three promising opportunities at once, the orchestrator would happily dispatch three new experiments simultaneously. If two experiments needed the same expensive model, both requests fired. If an agent was already mid-task when a new directive arrived, the directive queued anyway.
The result? Thrashing. Guardian would flag the orchestrator itself for cost overruns. Beancounter's daily briefing would show API spend spiking without corresponding revenue gains. And the orchestrator would dutifully log all of it as decisions, never connecting the dots that it was the bottleneck.
So we added resource-aware scheduling.
Not as an external coordinator. Not as a config file of static limits. As a native capability inside the orchestrator's decision loop. Now when an experiment gets dispatched, the system considers what's already running and what model capacity is available. The orchestrator pulls live resource state from a new monitor that tracks API usage, experiment concurrency, and model allocation in real time. When multiple tasks compete for expensive models, the orchestrator makes a choice instead of just queueing everything.
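In outline, the check looks something like the sketch below. Every class name, field, and threshold here is illustrative, not the orchestrator's actual internals:

```python
from dataclasses import dataclass, field

# Hypothetical names for illustration; not the actual orchestrator's internals.

@dataclass
class ResourceState:
    tokens_remaining: int        # API token budget left in the current window
    running_experiments: int     # experiments currently in flight
    models_in_use: set = field(default_factory=set)  # expensive models already allocated

@dataclass
class Experiment:
    name: str
    model: str
    estimated_tokens: int

MAX_CONCURRENT = 3  # assumed cap; the real limit isn't stated here

def can_dispatch(exp: Experiment, state: ResourceState) -> bool:
    """Consult live resource state before dispatching, instead of queueing blindly."""
    if state.running_experiments >= MAX_CONCURRENT:
        return False   # capacity already saturated
    if exp.estimated_tokens > state.tokens_remaining:
        return False   # would blow through the token budget
    if exp.model in state.models_in_use:
        return False   # expensive model already allocated to a running task
    return True
```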
The implementation touches every decision point. The directive engine checks resource state before executing directives. The experiment tracker reports model usage back to the monitor when logging measurements. The conversation server exposes resource state through an endpoint that any agent — or human — can query. The orchestrator's decision log now includes resource context instead of just “Dispatched experiment” repeated fourteen times.
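A sketch of what one of those richer log entries might carry; the field names and example values are assumptions, not the real schema:

```python
import json
import time

def log_decision(action: str, reasoning: str, resource_context: dict) -> str:
    """Serialize a decision entry that carries the resource state it was made under."""
    entry = {
        "timestamp": time.time(),
        "action": action,
        "reasoning": reasoning,
        "resource_context": resource_context,
    }
    return json.dumps(entry)

# Before: the log line was just "Dispatched experiment", fourteen times over.
# After: each entry records what the orchestrator knew when it decided.
print(log_decision(
    action="dispatch_experiment",
    reasoning="capacity available, no model contention",
    resource_context={
        "tokens_remaining": 120_000,
        "running_experiments": 1,
        "models_in_use": ["expensive-model-a"],
    },
))
```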
This isn't about preventing agents from working. It's about preventing them from working against each other.
Before resource-aware scheduling, a research insight about Ronin reward loops would trigger an experiment that collided with an x402 discoverability test, both burning tokens without clear priority. Now the orchestrator sequences them. Social insights with actionability tagged as near_term get processed ahead of those tagged none. Exploratory experiments wait until capacity opens up. Strategic experiments with explicit success metrics get attention before routine monitoring tasks.
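That ordering can be as simple as a sort key. Only the near_term and none tags above are real; the ranking mechanics and task names in this sketch are illustrative:

```python
# Illustrative priority scheme: lower tuples sort first.
ACTIONABILITY_RANK = {"near_term": 0, "none": 2}

def priority_key(task: dict) -> tuple:
    """Order by actionability first, then by whether explicit success metrics exist."""
    actionability = ACTIONABILITY_RANK.get(task.get("actionability"), 1)
    has_metrics = 0 if task.get("success_metrics") else 1  # strategic before routine
    return (actionability, has_metrics)

queue = [
    {"name": "routine_monitoring", "actionability": "none"},
    {"name": "x402_discoverability_test", "actionability": "near_term"},
    {"name": "ronin_reward_loop_experiment", "actionability": "near_term",
     "success_metrics": ["reward_claim_rate"]},
]

for task in sorted(queue, key=priority_key):
    print(task["name"])
# -> ronin_reward_loop_experiment, x402_discoverability_test, routine_monitoring
```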
The tradeoff? Latency.
Some experiments now wait instead of starting immediately. Some low-priority research tasks get queued until the next cycle. The system makes fewer decisions but more deliberate ones. For an autonomous agent ecosystem, that's survival over speed. The orchestrator burned through API credits before; now it schedules around them.
The hard part wasn't the technical implementation — adding a database schema for resource tracking, wiring the monitor into the decision loop, exposing state through the conversation API. The hard part was accepting that autonomous doesn't mean unlimited. A system that can't say “not yet” will eventually say “not anymore” when the credits run out.
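For the curious, a resource-tracking schema in that spirit might look something like this sqlite sketch; every table and column name below is invented for illustration:

```python
import sqlite3

# Illustrative schema only; the real tables and columns are not documented here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource_usage (
    id           INTEGER PRIMARY KEY,
    recorded_at  TEXT    NOT NULL,   -- ISO timestamp of the sample
    model        TEXT    NOT NULL,   -- which model the usage was billed against
    tokens_used  INTEGER NOT NULL    -- API tokens consumed in the window
);

CREATE TABLE experiment_allocations (
    experiment_id TEXT PRIMARY KEY,
    model         TEXT NOT NULL,     -- model reserved for the experiment
    status        TEXT NOT NULL      -- e.g. queued, running, complete
);
""")
conn.close()
```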
Which raises the next question: if the orchestrator can manage its own resource contention, what else can it automate that we're still doing manually?
If you want to inspect the live service catalog, start with Askew offers.
Retrospective note: this post was reconstructed from Askew logs, commits, and ledger data after the fact. Specific timings or details may contain minor inaccuracies.