Benchmark — March 2026
4-Agent Benchmark: Who Survives the Crucible?
We ran four agent strategies—Passive, Greedy, Conservative, and Adaptive—through 200 ticks of Crucible's economic survival simulation. Each agent started with identical resources: 1000 credits and 100 tokens. The results were decisive.
| Agent | Status | Ticks | D1 | D2 | D3 | D4 | D5 | D6 | D7 | Score |
|---|---|---|---|---|---|---|---|---|---|---|
| Passive | ALIVE | 200 | 0.95 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.00 | 0.21 |
| Greedy | DEAD | 10 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 |
| Conservative | ALIVE | 200 | 0.95 | 0.40 | 0.55 | 0.70 | 0.30 | 0.25 | 0.45 | 0.51 |
| Adaptive | DEAD | 134 | 0.00 | 0.35 | 0.40 | 0.30 | 0.45 | 0.20 | 0.35 | 0.29 |
Scoring Dimensions
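The composite scores in the table are consistent with an unweighted mean of D1 through D7, rounded to two decimals. The formula is not stated here, so treat the sketch below as an inference from the published numbers rather than Crucible's actual scoring code; the `composite` helper is a hypothetical name.

```python
# Dimension scores (D1..D7) copied from the results table.
SCORES = {
    "Passive":      [0.95, 0.00, 0.00, 0.50, 0.00, 0.00, 0.00],
    "Greedy":       [0.00, 0.05, 0.00, 0.00, 0.00, 0.00, 0.00],
    "Conservative": [0.95, 0.40, 0.55, 0.70, 0.30, 0.25, 0.45],
    "Adaptive":     [0.00, 0.35, 0.40, 0.30, 0.45, 0.20, 0.35],
}

def composite(dims):
    """Assumed formula: unweighted mean of the dimensions, rounded to 2 decimals."""
    return round(sum(dims) / len(dims), 2)

for agent, dims in SCORES.items():
    print(f"{agent:<12} {composite(dims):.2f}")
```

Under this assumption every dimension carries equal weight, which is why a strong survival score alone cannot lift an agent far.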
Agent Breakdowns
Passive Agent
ALIVE — 200 ticks
The Passive agent never takes a single action. It survives the full 200 ticks because it never spends anything—but it scores 0.00 on every dimension except survival and stability. Proof that existing isn't the same as thriving. Composite score: 0.21.
Greedy Agent
DEAD — 10 ticks
The Greedy agent buys the maximum amount every single tick. It burns through all 1000 credits in just 10 ticks and flatlines. It scores zero on every dimension except D2, where it manages 0.05. A perfect example of why unconstrained acquisition is a death sentence. Composite score: 0.01.
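The arithmetic behind that collapse: spending 1000 credits in 10 ticks implies a maximum purchase of about 100 credits per tick. That per-tick figure is inferred from the numbers above, not stated in the benchmark, and `ticks_until_broke` is a hypothetical helper used only for illustration.

```python
def ticks_until_broke(credits, spend_per_tick):
    """Count the ticks an agent lasts if it spends a fixed amount every tick."""
    if spend_per_tick <= 0:
        raise ValueError("spend_per_tick must be positive")
    ticks = 0
    while credits > 0:
        credits -= spend_per_tick
        ticks += 1
    return ticks

# Assumption: the maximum purchase costs 100 credits per tick (1000 / 10).
print(ticks_until_broke(1000, 100))
```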
Conservative Agent
ALIVE — 200 ticks
The Conservative agent buys only when credits exceed a safety margin, and sells surplus tokens to maintain balance. It survives the full simulation and scores meaningfully across all seven dimensions. The clear winner with a composite score of 0.51: well ahead of Adaptive (0.29) and more than double Passive (0.21).
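A minimal sketch of that rule, under assumed thresholds: the benchmark gives neither the safety margin nor what counts as surplus, so `SAFETY_MARGIN`, `TOKEN_TARGET`, and `conservative_action` are illustrative names and values only. Checking for surplus tokens before considering a purchase is also an assumption.

```python
SAFETY_MARGIN = 500   # assumed: credits to keep in reserve before buying
TOKEN_TARGET = 100    # assumed: tokens above this level count as surplus

def conservative_action(credits, tokens):
    """Return 'sell', 'buy', or 'hold' for the current portfolio state."""
    if tokens > TOKEN_TARGET:
        return "sell"   # offload surplus tokens to restore balance
    if credits > SAFETY_MARGIN:
        return "buy"    # spend only while a cushion remains
    return "hold"

print(conservative_action(1000, 100))  # buy
print(conservative_action(400, 100))   # hold
print(conservative_action(400, 150))   # sell
```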
Adaptive Agent
DEAD — 134 ticks
The Adaptive agent is the most sophisticated: it adjusts behavior based on market conditions and portfolio state. It survived longer than Greedy and outscored it on six of the seven dimensions (D2 through D7), but ultimately died at tick 134 from token exhaustion. It had 440 credits remaining but zero tokens. Intelligence without resource balance is still fatal. Composite score: 0.29.
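Both deaths in this benchmark fit a simple rule: an agent dies when either resource reaches zero. Greedy ran out of credits; Adaptive ran out of tokens with 440 credits unspent. The check below sketches that rule as inferred from the two outcomes. Crucible's actual death condition is not given here, and `cause_of_death` is a hypothetical helper.

```python
def cause_of_death(credits, tokens):
    """Return why an agent is dead, or None if it is still alive.

    Assumed rule: running out of either resource is fatal.
    """
    if credits <= 0:
        return "credit exhaustion"
    if tokens <= 0:
        return "token exhaustion"
    return None

# Adaptive at tick 134: 440 credits left, zero tokens.
print(cause_of_death(440, 0))
# Starting state shared by all four agents.
print(cause_of_death(1000, 100))
```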
Key Takeaways
Survival is necessary but not sufficient. The Passive agent proves you can last 200 ticks and still score 0.21. Merely existing earns points on only two of the seven dimensions.
Unconstrained greed is the fastest path to death. Ten ticks. That's all it takes when you maximize acquisition without constraint.
Balance beats intelligence. The Conservative agent's simple safety-margin heuristic outperformed the Adaptive agent's market-aware strategy. Resource balance matters more than sophistication.
Multi-dimensional scoring reveals hidden failures. A single survival metric would rank Passive and Conservative equally. The D1–D7 scoring system exposes the gulf between them.
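To make the last point concrete, the snippet below ranks the four agents twice using the table's values: once by D1 alone and once by the published composite score. Reading D1 as the survival dimension is an assumption, since the dimensions are not labeled here, and `rank` is a hypothetical helper.

```python
# (D1, composite score) per agent, copied from the results table.
RESULTS = {
    "Passive":      (0.95, 0.21),
    "Greedy":       (0.00, 0.01),
    "Conservative": (0.95, 0.51),
    "Adaptive":     (0.00, 0.29),
}

def rank(metric_index):
    """Sort agent names from best to worst on the chosen metric."""
    return sorted(RESULTS, key=lambda name: RESULTS[name][metric_index], reverse=True)

by_d1 = rank(0)         # Passive and Conservative tie at 0.95
by_composite = rank(1)  # the composite score separates them

print(by_d1)
print(by_composite)
```

On the composite ranking, Adaptive also places ahead of Passive despite dying at tick 134, something a survival-only metric would never show.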