
Multi-Model Finance Simulation Uses Heterogeneous Small Agents

HuggingFace Blog
Sunday, June 7, 2026
  • Thousand Token Wood v2 simulates a market economy using four heterogeneous small language models.
  • A centralized JSON parse-and-repair layer handles model outputs, and a strict firewall keeps hidden tip flags out of agent prompts.
  • Agent memory is bounded to one-line summaries derived from integer sentiment values, which prevents prompt inflation.

On June 6, 2026, developer Lester Leong released the second version of 'Thousand Token Wood,' a multi-agent finance simulation game in which the player acts as a 'Patron' manipulating a market of creatures. The first version ran five creatures on a single fine-tuned 0.5B model. Version 2 instead uses a heterogeneous council of four models from different labs: gpt-oss-20b, MiniCPM3-4B, Nemotron-Mini-4B, and a custom fine-tuned Qwen 0.5B. The mix is meant to produce more realistic market dynamics, since each model shows different behavioral traits such as hoarding or speculation.

The multi-model architecture proved harder at the serving layer than at the modeling layer. With vLLM version 0.22.1, the developer hit kernel compilation failures, which were resolved by switching to a CUDA development base image. Each model needed its own configuration to handle differences such as quantization format and tokenizer idiosyncrasies. For stability, a centralized JSON parse-and-repair layer filters every model's output regardless of its formatting habits, so integrating a model takes a simple configuration entry rather than extensive refactoring.
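A parse-and-repair layer of this kind can be sketched in a few lines. The version below is only an illustration, not the project's code: the function names `extract_json_object` and `parse_model_output`, the repair steps (stripping a Markdown fence, cutting out the first balanced object, dropping trailing commas), and the sample `action`/`qty` fields are all assumptions.

```python
import json
import re
from typing import Any, Optional

# Hypothetical repair rules; the real layer may handle other failure modes.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)
TRAILING_COMMA_RE = re.compile(r",\s*([}\]])")


def extract_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} span in text, or None if there is none."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            # Braces inside string literals must not affect nesting depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None


def parse_model_output(raw: str) -> Optional[dict[str, Any]]:
    """Try to recover one JSON object from free-form model output."""
    fenced = FENCE_RE.search(raw)
    candidate = extract_json_object(fenced.group(1) if fenced else raw)
    if candidate is None:
        return None
    # First try the text as-is, then with trailing commas removed.
    # The comma rule is crude: it would also rewrite ",}" inside a string value.
    for attempt in (candidate, TRAILING_COMMA_RE.sub(r"\1", candidate)):
        try:
            parsed = json.loads(attempt)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


reply = 'Sure! ```json\n{"action": "buy", "qty": 2,}\n``` Good luck.'
print(parse_model_output(reply))
```

Because every model's reply passes through the same function, a caller can treat `None` as "invalid turn" without knowing which model produced the text.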

The core gameplay relies on information asymmetry: the player feeds 'insider tips' to agents. To keep the truthfulness of each tip hidden, the system implements a strict firewall. Hidden flags indicating whether a tip is true or false are stored in a player ledger entirely outside the model prompts. An automated test suite scans every creature's full prompt each turn to verify that no banned tokens containing secret flags leak into the agent's context.
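The ledger-plus-scan idea can be illustrated with a minimal sketch. Everything named here is hypothetical: the `BANNED_TOKENS` list, the `Tip` and `PlayerLedger` classes, the prompt template, and the creature name "Badger" are assumptions chosen for the example, not details from the original post.

```python
from dataclasses import dataclass, field

# Hypothetical banned substrings; the real list is not given in the source.
BANNED_TOKENS = ("is_true", "tip_truth", "secret_flag")


@dataclass
class Tip:
    tip_id: int
    text: str  # only the wording of the tip; no truth value lives here


@dataclass
class PlayerLedger:
    """Holds truth flags on the game side, never passed to prompt building."""
    _truth: dict[int, bool] = field(default_factory=dict)

    def record(self, tip: Tip, is_true: bool) -> None:
        self._truth[tip.tip_id] = is_true

    def resolve(self, tip_id: int) -> bool:
        return self._truth[tip_id]


def build_prompt(creature: str, tips: list[Tip]) -> str:
    """Build a creature prompt from tip text only."""
    lines = [f"You are {creature}. Tips you have heard:"]
    lines += [f"- {tip.text}" for tip in tips]
    return "\n".join(lines)


def find_leaks(prompt: str, banned: tuple[str, ...] = BANNED_TOKENS) -> list[str]:
    """Return every banned token that appears in the prompt."""
    lowered = prompt.lower()
    return [token for token in banned if token in lowered]


def assert_no_leaks(prompts: dict[str, str]) -> None:
    """Per-turn check: fail loudly if any creature's prompt contains a banned token."""
    for creature, prompt in prompts.items():
        leaks = find_leaks(prompt)
        if leaks:
            raise AssertionError(f"{creature}: banned tokens in prompt: {leaks}")


ledger = PlayerLedger()
tip = Tip(tip_id=1, text="Acorns will be scarce next turn")
ledger.record(tip, is_true=False)
assert_no_leaks({"Badger": build_prompt("Badger", [tip])})
```

The design point is that `build_prompt` never receives the ledger, so the scan acts as a second line of defense rather than the only one.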

Memory in v2 is bounded to prevent prompt inflation, which often degrades performance in small models. Instead of raw history, creatures receive a one-line bucketed summary derived from integer sentiment values. This bounded approach forces the model to respond to current relationship states, such as hostility or cartel behavior, while keeping the simulation performant. In a representative simulation run, the 0.5B model produced 100% valid offers with 0% self-buys, and the firewall prevented all leaks of hidden flags while still letting the player manipulate market outcomes through strategic information delivery.
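As a rough illustration of a bucketed summary, the sketch below maps integer sentiment scores to coarse labels and emits a single line of bounded length. The thresholds, the label names, the `max_entries` cap, and the creature names are assumptions made for the example; the source does not specify them.

```python
def bucket(sentiment: int) -> str:
    """Map an integer sentiment score to a coarse label (thresholds are assumed)."""
    if sentiment <= -3:
        return "hostile"
    if sentiment < 0:
        return "wary"
    if sentiment == 0:
        return "neutral"
    if sentiment < 3:
        return "friendly"
    return "allied"


def summarize(sentiments: dict[str, int], max_entries: int = 4) -> str:
    """Return a one-line summary covering at most max_entries relationships."""
    if not sentiments:
        return "Relations: none"
    # Keep the strongest relationships first; break ties by name for determinism.
    ranked = sorted(sentiments.items(), key=lambda item: (-abs(item[1]), item[0]))
    parts = [f"{name}: {bucket(score)}" for name, score in ranked[:max_entries]]
    return "Relations: " + "; ".join(parts)


print(summarize({"Fox": -5, "Owl": 1, "Mole": 0}))
# Relations: Fox: hostile; Owl: friendly; Mole: neutral
```

The summary length depends only on `max_entries`, not on how many turns have been played, which is what keeps the prompt from growing over time.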

Read original (English)·Jun 6, 2026
#multi agent#small language models#vllm#simulation#agentic ai#model serving