Skip to content

Delegation & Token Economy Thresholds

Three-band token law

Rolling-7-day session tokens MUST distribute across three tiers:

TierWhoNormal targetFailure threshold
Commanderthe full commander fleet40–70%< 20% = STRUCTURAL-FAILURE
State3 captains (+ 5 colonels when used)20–40%< 10% or > 50% = STRUCTURAL-FAILURE
Hinata main-threadthe General working directly10–20%> 40% = STRUCTURAL-FAILURE

Bands sum to 70–130% (overlap intentional — soft enforcement).

Measurement window: 7 days rolling. This is the ONLY window. No toggle, no alternatives (1d/14d/30d removed 2026-06-08 — Michael: "toggle useless"). Studio displays 7d exclusively.

Verdicts

VerdictCondition
Strict PASSAll 3 bands within target
MILD-WARN1–2 bands outside target
STRUCTURAL-FAILUREAny band hits its failure threshold

Measurement

  • Tool: check-delegation-ratio.py --rolling-days 7 --json
  • Enforcement: session-end-check.sh at session end
  • Escalation: <40% delegation → SEVERE Telegram flag; persistent <40% over 3 sessions → systemic debt surfaced to Orochimaru

Never-halt law (hardened 2026-06-08)

Claude Code in acceptEdits mode MUST NOT stop execution unless a true bottleneck is hit:

True bottlenecks (the ONLY reasons to halt):

  • Source-of-truth unparseable
  • Filesystem unwriteable
  • Context exhausted (approaching 97%)
  • Unauthorised outward action (push, message, deploy)
  • Michael-blocked decision (needs his credentials, hands, or explicit choice)

NOT a bottleneck (keep executing):

  • "What next?" — if the queue has executable work, execute it
  • Interview questions about decisions already made — settled decisions don't reopen
  • Confirming plan approval when the plan is obviously correct
  • Waiting for subagent results when other independent work exists

Enforcement: fan out to commanders at ALL times. Every prompt turn that contains executable non-blocked work MUST dispatch at least one subagent. Confirmed decisions execute immediately in background. The main thread orchestrates and composes the final reply — it does not execute domain work.

Subagent runtime benchmarks

DurationVerdictAction
<5 minGREEN
5–10 minAMBER
10–15 minWARN
15–20 minESCALATETelegram + post-mortem
>20 minKILL-ELIGIBLEAbort, re-dispatch as two bounded units

Cloud gets a 20-min base ceiling. Explore/Plan agents are WARN-exempt.

Parallelism targets

  • Fan out as close to the 25-agent concurrency cap as work supports
  • Re-fan continuously — re-dispatch as units complete; never sit below ~20 agents while autonomous work exists
  • Split any unit estimated >30 min or >50k tokens into parallel sub-agents before dispatch

When NOT to delegate (main-thread is correct)

Reading a single known path · a one-line edit · a single Bash command · closing one task row · composing the final reply to Michael. Delegating these adds latency without protecting context.

Attribution (two axes — never conflate)

QuestionSignal
Delegation ratio — was work fragmented into subagents?subagent-span tokens ÷ eligible execution-session tokens
Commander attribution — which commander owned this burn?commander.attention.shift events × session-window JOIN

The named error: grouping burn by subagent_type — it only sees ~0.04% that flowed through Agent calls and misses the 99.96% in the main thread.


Extracted from supreme-court/runtime/runtime_workflow §7