Delegation & Token Economy Thresholds

Three-band token law

Rolling-7-day session tokens MUST distribute across three tiers:

Tier	Who	Normal target	Failure threshold
Commander	the full commander fleet	40–70%	< 20% = STRUCTURAL-FAILURE
State	3 captains (+ 5 colonels when used)	20–40%	< 10% or > 50% = STRUCTURAL-FAILURE
Hinata main-thread	the General working directly	10–20%	> 40% = STRUCTURAL-FAILURE

Bands sum to 70–130% (overlap intentional — soft enforcement).

Measurement window: 7 days rolling. This is the ONLY window. No toggle, no alternatives (1d/14d/30d removed 2026-06-08 — Michael: "toggle useless"). Studio displays 7d exclusively.

Verdicts

Verdict	Condition
Strict PASS	All 3 bands within target
MILD-WARN	1–2 bands outside target
STRUCTURAL-FAILURE	Any band hits its failure threshold

Measurement

Tool: check-delegation-ratio.py --rolling-days 7 --json
Enforcement: session-end-check.sh at session end
Escalation: <40% delegation → SEVERE Telegram flag; persistent <40% over 3 sessions → systemic debt surfaced to Orochimaru

Never-halt law (hardened 2026-06-08)

Claude Code in acceptEdits mode MUST NOT stop execution unless a true bottleneck is hit:

True bottlenecks (the ONLY reasons to halt):

Source-of-truth unparseable
Filesystem unwriteable
Context exhausted (approaching 97%)
Unauthorised outward action (push, message, deploy)
Michael-blocked decision (needs his credentials, hands, or explicit choice)

NOT a bottleneck (keep executing):

"What next?" — if the queue has executable work, execute it
Interview questions about decisions already made — settled decisions don't reopen
Confirming plan approval when the plan is obviously correct
Waiting for subagent results when other independent work exists

Enforcement: fan out to commanders at ALL times. Every prompt turn that contains executable non-blocked work MUST dispatch at least one subagent. Confirmed decisions execute immediately in background. The main thread orchestrates and composes the final reply — it does not execute domain work.

Subagent runtime benchmarks

Duration	Verdict	Action
<5 min	GREEN	—
5–10 min	AMBER	—
10–15 min	WARN	—
15–20 min	ESCALATE	Telegram + post-mortem
>20 min	KILL-ELIGIBLE	Abort, re-dispatch as two bounded units

Cloud gets a 20-min base ceiling. Explore/Plan agents are WARN-exempt.

Parallelism targets

Fan out as close to the 25-agent concurrency cap as work supports
Re-fan continuously — re-dispatch as units complete; never sit below ~20 agents while autonomous work exists
Split any unit estimated >30 min or >50k tokens into parallel sub-agents before dispatch

When NOT to delegate (main-thread is correct)

Reading a single known path · a one-line edit · a single Bash command · closing one task row · composing the final reply to Michael. Delegating these adds latency without protecting context.

Attribution (two axes — never conflate)

Question	Signal
Delegation ratio — was work fragmented into subagents?	subagent-span tokens ÷ eligible execution-session tokens
Commander attribution — which commander owned this burn?	`commander.attention.shift` events × session-window JOIN

The named error: grouping burn by subagent_type — it only sees ~0.04% that flowed through Agent calls and misses the 99.96% in the main thread.

Extracted from supreme-court/runtime/runtime_workflow §7

Delegation & Token Economy Thresholds ​

Three-band token law ​

Verdicts ​

Measurement ​

Never-halt law (hardened 2026-06-08) ​

Subagent runtime benchmarks ​

Parallelism targets ​

When NOT to delegate (main-thread is correct) ​

Attribution (two axes — never conflate) ​