Optimizing Coding Agent Efficiency With Model Delegation
- Simon Willison advocates letting Claude Code agents exercise their own judgment to improve task efficiency.
- Delegating coding tasks to subagents using smaller models like Haiku optimizes token usage and costs.
- Willison's workflow reserves high-tier models for complex, judgment-heavy tasks while delegating trivial code edits to cheaper models.
On July 3, 2026, developer Simon Willison discussed a strategy for optimizing coding agent performance by letting models like Fable and Opus use their own judgment. During a recent fireside chat at AIE, team members from the Claude Code project suggested that, rather than giving agents rigid instructions, users should tell them to decide for themselves how much testing a change requires. This avoids running unnecessary automated tests for trivial design or copy changes.
Applying this idea, Willison set up a workflow in which coding tasks are delegated to subagents. The main model is prompted to assess the difficulty of each task and to select a lower-power model when appropriate. Mechanical or trivial edits go to Haiku, while substantive implementation work goes to Sonnet. Judgment-heavy tasks, including design, data synthesis, and code review, remain with the main, higher-tier model.
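The three-tier routing described above can be sketched as a simple lookup. This is only an illustration, not Willison's actual configuration: in his workflow the main model makes the difficulty call itself from a natural-language instruction. The `Difficulty` enum, the `pick_model` function, and the tier labels below are hypothetical names chosen for this example.

```python
from enum import Enum


class Difficulty(Enum):
    """Hypothetical difficulty levels mirroring the three tiers in the text."""
    TRIVIAL = "trivial"          # mechanical or trivial edits
    SUBSTANTIVE = "substantive"  # substantive implementation work
    JUDGMENT = "judgment"        # design, data synthesis, code review


# Assumed tier labels for illustration; real model identifiers would differ.
MODEL_FOR = {
    Difficulty.TRIVIAL: "haiku",
    Difficulty.SUBSTANTIVE: "sonnet",
    Difficulty.JUDGMENT: "main",  # stays with the main, higher-tier model
}


def pick_model(difficulty: Difficulty) -> str:
    """Return the tier label that should handle a task of this difficulty."""
    return MODEL_FOR[difficulty]


if __name__ == "__main__":
    for level in Difficulty:
        print(f"{level.value}: {pick_model(level)}")
```

Keeping the mapping in one table makes the cost trade-off explicit: only tasks classified as judgment-heavy consume the top-tier allowance.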
Willison noted that this delegation method has improved his personal efficiency while slowing the depletion of his Fable token allowance. The configuration is stored as a memory file, which formalizes the instruction to prioritize cost and efficiency: top-tier model capacity is reserved for complex reasoning rather than routine coding implementation.