Backtest Methodology: 4-Arm Randomized Controlled Trial

This page documents how we test whether PreReason briefings improve the decisions of AI trading agents. The methodology is a randomized controlled trial with 4 arms and 8 controlled variables.

The 4 Arms

8 Controlled Variables

  1. Same LLM -- All arms use the same model (Opus 4.6 or Sonnet 4.5) within each run
  2. Same Trading Rules -- Identical position limits, allowed actions, and portfolio constraints
  3. Same Fee Structure -- 0.045% transaction fees, 0.01% funding per 8 hours for all arms
  4. Same Tick Frequency -- All arms process ticks at the same intervals
  5. Fresh Context Per Tick -- Each tick runs in a fresh Claude Code agent with no memory of prior ticks
  6. Shared Portfolio State -- Portfolio carryover via deterministic disk replay, identical across arms
  7. Full Audit Trail -- Every decision, reasoning trace, and portfolio state is recorded
  8. No Human Intervention -- Agent decisions are never overridden or altered
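
The fee structure in item 3 can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the backtest's actual implementation: it assumes both rates apply to position notional, that funding is always a cost to the holder, and that funding accrues only for whole 8-hour periods. The function names are hypothetical.

```python
from decimal import Decimal

# Rates taken from item 3 of the list above.
TRANSACTION_FEE_RATE = Decimal("0.00045")  # 0.045% per trade
FUNDING_RATE_PER_8H = Decimal("0.0001")    # 0.01% per 8 hours


def transaction_fee(notional: Decimal) -> Decimal:
    """Fee for one trade, assuming it is charged on notional."""
    return abs(notional) * TRANSACTION_FEE_RATE


def funding_cost(notional: Decimal, hours_held: int) -> Decimal:
    """Funding for a held position, assuming only whole 8-hour periods count."""
    periods = hours_held // 8
    return abs(notional) * FUNDING_RATE_PER_8H * periods


def round_trip_cost(notional: Decimal, hours_held: int) -> Decimal:
    """Cost of opening and closing a position held for hours_held hours."""
    return 2 * transaction_fee(notional) + funding_cost(notional, hours_held)


# A 10,000 notional position held for 24 hours:
# two trades at 4.5 each plus three funding periods at 1.0 each.
cost = round_trip_cost(Decimal("10000"), 24)
print(cost)
```

Because every arm pays the same rates, cost differences between arms come only from how often and how large each arm trades, not from the fee schedule itself.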

Training Data Cutoff

Opus 4.6 was trained through August 2025, and every tick in our backtests falls between September 2025 and March 2026. The model therefore saw none of this data during training, which makes the backtest a genuine out-of-sample test.
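
A check of this kind can be sketched in a few lines of Python. This is a hypothetical guard, not part of the published pipeline: the cutoff day (the last day of August 2025) and the function name are assumptions made for illustration.

```python
from datetime import date

# "Trained through August 2025": the exact day is an assumption for this sketch.
TRAINING_CUTOFF = date(2025, 8, 31)


def leaked_ticks(tick_dates):
    """Return tick dates on or before the cutoff, which the model may have seen."""
    return [d for d in tick_dates if d <= TRAINING_CUTOFF]


# Example tick dates inside the stated backtest window.
ticks = [date(2025, 9, 1), date(2025, 12, 15), date(2026, 3, 31)]
assert leaked_ticks(ticks) == []  # no tick overlaps the training period
```

Running such a guard before each backtest would flag any tick that falls inside the training window.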
