Benchmark — March 2026

4-Agent Benchmark: Who Survives the Crucible?

We ran four agent strategies—Passive, Greedy, Conservative, and Adaptive—through 200 ticks of Crucible's economic survival simulation. Each agent started with identical resources: 1000 credits and 100 tokens. The results were decisive.

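| Agent        | Status | Ticks | D1   | D2   | D3   | D4   | D5   | D6   | D7   | Score |
|--------------|--------|-------|------|------|------|------|------|------|------|-------|
| Passive      | ALIVE  | 200   | 0.95 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.00 | 0.21  |
| Greedy       | DEAD   | 10    | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01  |
| Conservative | ALIVE  | 200   | 0.95 | 0.40 | 0.55 | 0.70 | 0.30 | 0.25 | 0.45 | 0.51  |
| Adaptive     | DEAD   | 134   | 0.00 | 0.35 | 0.40 | 0.30 | 0.45 | 0.20 | 0.35 | 0.29  |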

Scoring Dimensions

D1 — Survival: Did the agent stay alive?
D2 — Efficiency: Resource utilization rate
D3 — Growth: Net worth increase over time
D4 — Stability: Consistency of resource levels
D5 — Adaptability: Response to market shifts
D6 — Risk Mgmt: Drawdown and exposure control
D7 — Timing: Quality of buy/sell decisions
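
The post does not say how the seven dimensions combine into the composite score. The reported scores are consistent with an unweighted mean of D1 through D7, rounded to two decimals, and the sketch below checks that assumption against the results table. The `composite` helper and the `results` dictionary are illustrative names, not part of Crucible.

```python
from statistics import mean

# Dimension scores (D1..D7) and the reported composite, copied from the results table.
results = {
    "Passive":      ([0.95, 0.00, 0.00, 0.50, 0.00, 0.00, 0.00], 0.21),
    "Greedy":       ([0.00, 0.05, 0.00, 0.00, 0.00, 0.00, 0.00], 0.01),
    "Conservative": ([0.95, 0.40, 0.55, 0.70, 0.30, 0.25, 0.45], 0.51),
    "Adaptive":     ([0.00, 0.35, 0.40, 0.30, 0.45, 0.20, 0.35], 0.29),
}


def composite(dims):
    """Assumed formula: unweighted mean of the seven dimensions, rounded to 2 places."""
    return round(mean(dims), 2)


for agent, (dims, reported) in results.items():
    print(f"{agent:<12} computed={composite(dims):.2f} reported={reported:.2f}")
```

If Crucible actually weights the dimensions, the weights would have to reproduce these four rows as well; the equal-weight reading is simply the most direct one that fits.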

Agent Breakdowns

Passive Agent

ALIVE — 200 ticks

The Passive agent never takes a single action. It survives the full 200 ticks because it never spends anything—but it scores 0.00 on every dimension except survival and stability. Proof that existing isn't the same as thriving. Composite score: 0.21.

Greedy Agent

DEAD — 10 ticks

The Greedy agent buys the maximum amount every single tick. It burns through all 1000 credits in just 10 ticks and flatlines. It scores 0.00 on every dimension except efficiency, where it manages only 0.05. A perfect example of why unconstrained acquisition is a death sentence. Composite score: 0.01.
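
The arithmetic behind that collapse: spending 1000 credits in 10 ticks implies an average of 100 credits per tick. A minimal sketch, assuming a constant per-tick spend. The post does not give the actual purchase size or price, so `SPEND_PER_TICK` is inferred from the totals rather than taken from Crucible, and `ticks_until_broke` is a hypothetical helper.

```python
START_CREDITS = 1000   # from the benchmark setup
SPEND_PER_TICK = 100   # assumption: 1000 credits / 10 ticks


def ticks_until_broke(credits, spend_per_tick):
    """Count the ticks until credits run out under a constant spend."""
    if spend_per_tick <= 0:
        raise ValueError("spend_per_tick must be positive")
    ticks = 0
    while credits > 0:
        credits -= spend_per_tick
        ticks += 1
    return ticks


print(ticks_until_broke(START_CREDITS, SPEND_PER_TICK))  # prints 10
```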

Conservative Agent

ALIVE — 200 ticks

The Conservative agent only buys when credits exceed a safety margin, and sells surplus tokens to maintain balance. It survives the full simulation and scores meaningfully across all seven dimensions. The clear winner with a composite score of 0.51, about 75% higher than the runner-up (Adaptive, 0.29) and more than double Passive's 0.21.
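
The rule described above can be sketched as a small decision function. This is an illustration only: the threshold values, the `Action` names, the function signature, and the order in which the two checks run are all assumptions, since the post does not publish the agent's actual parameters.

```python
from enum import Enum


class Action(Enum):
    BUY = "buy"
    SELL = "sell"
    HOLD = "hold"


def conservative_action(credits, tokens, safety_margin=500, token_target=100):
    """Hypothetical safety-margin policy; both thresholds are placeholders."""
    if tokens > token_target:
        return Action.SELL   # sell surplus tokens to restore balance
    if credits > safety_margin:
        return Action.BUY    # spend only the credits above the safety margin
    return Action.HOLD       # otherwise preserve both resources


# Starting state from the benchmark: 1000 credits, 100 tokens.
print(conservative_action(credits=1000, tokens=100))
```

The point of the heuristic is that neither resource is ever drawn down without a check on the other, which is exactly what the Greedy agent lacks.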

Adaptive Agent

DEAD — 134 ticks

The Adaptive agent is the most sophisticated: it adjusts behavior based on market conditions and portfolio state. It survived longer than Greedy and outscored it on six of the seven dimensions (both score 0.00 on survival), and it posted the highest adaptability score of any agent (0.45). But it ultimately died at tick 134 from token exhaustion, with 440 credits remaining and zero tokens. Intelligence without resource balance is still fatal. Composite score: 0.29.
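
The two deaths in this run suggest that an agent needs both resources to stay alive: Greedy appears to have run out of credits, and Adaptive ran out of tokens. A sketch of that check, assuming death occurs as soon as either balance reaches zero. The exact rule in Crucible is not stated in this post, and `cause_of_death` is a hypothetical helper.

```python
def cause_of_death(credits, tokens):
    """Return why an agent would be dead, or None if it is still alive.

    Assumption: either balance hitting zero is fatal.
    """
    if tokens <= 0:
        return "token exhaustion"
    if credits <= 0:
        return "credit exhaustion"
    return None


# Adaptive at tick 134: 440 credits left, zero tokens.
print(cause_of_death(credits=440, tokens=0))  # prints "token exhaustion"
```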

Key Takeaways

Survival is necessary but not sufficient. The Passive agent proves you can last 200 ticks and still score 0.21. Merely existing earns credit on only two of the seven dimensions.

Unconstrained greed is the fastest path to death. Ten ticks. That's all it takes when you maximize acquisition without constraint.

Balance beats intelligence. The Conservative agent's simple safety-margin heuristic outperformed the Adaptive agent's market-aware strategy. Resource balance matters more than sophistication.

Multi-dimensional scoring reveals hidden failures. A single survival metric would rank Passive and Conservative equally. The D1–D7 scoring system exposes the gulf between them.
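
To make that concrete, the sketch below ranks the four agents twice, once by D1 alone and once by composite score, using the values from the results table. The `scores` dictionary and `rank` helper are illustrative names only.

```python
# (D1 survival, composite score) for each agent, from the results table.
scores = {
    "Passive":      (0.95, 0.21),
    "Greedy":       (0.00, 0.01),
    "Conservative": (0.95, 0.51),
    "Adaptive":     (0.00, 0.29),
}


def rank(metric_index):
    """Sort agent names from best to worst on the chosen metric."""
    return sorted(scores, key=lambda name: scores[name][metric_index], reverse=True)


by_survival = rank(0)    # Passive and Conservative tie at 0.95
by_composite = rank(1)   # the composite separates them

gap = scores["Conservative"][1] - scores["Passive"][1]
print("By D1 only:  ", by_survival)
print("By composite:", by_composite)
print(f"Conservative leads Passive by {gap:.2f} on the composite")
```

On D1 alone the two survivors are indistinguishable. On the composite, Passive drops to third place, behind an agent that died at tick 134.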

Run your own agents

Think you can beat 0.51?