Appearance
Delegation & Token Economy Thresholds
Three-band token law
Rolling-7-day session tokens MUST distribute across three tiers:
| Tier | Who | Normal target | Failure threshold |
|---|---|---|---|
| Commander | the full commander fleet | 40–70% | < 20% = STRUCTURAL-FAILURE |
| State | 3 captains (+ 5 colonels when used) | 20–40% | < 10% or > 50% = STRUCTURAL-FAILURE |
| Hinata main-thread | the General working directly | 10–20% | > 40% = STRUCTURAL-FAILURE |
Bands sum to 70–130% (overlap intentional — soft enforcement).
Measurement window: 7 days rolling. This is the ONLY window. No toggle, no alternatives (1d/14d/30d removed 2026-06-08 — Michael: "toggle useless"). Studio displays 7d exclusively.
Verdicts
| Verdict | Condition |
|---|---|
| Strict PASS | All 3 bands within target |
| MILD-WARN | 1–2 bands outside target |
| STRUCTURAL-FAILURE | Any band hits its failure threshold |
Measurement
- Tool:
check-delegation-ratio.py --rolling-days 7 --json - Enforcement:
session-end-check.shat session end - Escalation: <40% delegation → SEVERE Telegram flag; persistent <40% over 3 sessions → systemic debt surfaced to Orochimaru
Never-halt law (hardened 2026-06-08)
Claude Code in acceptEdits mode MUST NOT stop execution unless a true bottleneck is hit:
True bottlenecks (the ONLY reasons to halt):
- Source-of-truth unparseable
- Filesystem unwriteable
- Context exhausted (approaching 97%)
- Unauthorised outward action (push, message, deploy)
- Michael-blocked decision (needs his credentials, hands, or explicit choice)
NOT a bottleneck (keep executing):
- "What next?" — if the queue has executable work, execute it
- Interview questions about decisions already made — settled decisions don't reopen
- Confirming plan approval when the plan is obviously correct
- Waiting for subagent results when other independent work exists
Enforcement: fan out to commanders at ALL times. Every prompt turn that contains executable non-blocked work MUST dispatch at least one subagent. Confirmed decisions execute immediately in background. The main thread orchestrates and composes the final reply — it does not execute domain work.
Subagent runtime benchmarks
| Duration | Verdict | Action |
|---|---|---|
| <5 min | GREEN | — |
| 5–10 min | AMBER | — |
| 10–15 min | WARN | — |
| 15–20 min | ESCALATE | Telegram + post-mortem |
| >20 min | KILL-ELIGIBLE | Abort, re-dispatch as two bounded units |
Cloud gets a 20-min base ceiling. Explore/Plan agents are WARN-exempt.
Parallelism targets
- Fan out as close to the 25-agent concurrency cap as work supports
- Re-fan continuously — re-dispatch as units complete; never sit below ~20 agents while autonomous work exists
- Split any unit estimated >30 min or >50k tokens into parallel sub-agents before dispatch
When NOT to delegate (main-thread is correct)
Reading a single known path · a one-line edit · a single Bash command · closing one task row · composing the final reply to Michael. Delegating these adds latency without protecting context.
Attribution (two axes — never conflate)
| Question | Signal |
|---|---|
| Delegation ratio — was work fragmented into subagents? | subagent-span tokens ÷ eligible execution-session tokens |
| Commander attribution — which commander owned this burn? | commander.attention.shift events × session-window JOIN |
The named error: grouping burn by subagent_type — it only sees ~0.04% that flowed through Agent calls and misses the 99.96% in the main thread.
Extracted from supreme-court/runtime/runtime_workflow §7