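Seven controlled backtest runs measure whether structured market briefings improve LLM trading agent performance. In every run, the treatment arm receiving PreReason briefings outperformed the control arm, by margins ranging from +4.46pp to +15.97pp, across two models (Opus 4.6 and Sonnet 4.5), time windows of 83 to 202 ticks, and varying market conditions. In runs 1 through 3 both arms finished negative, so the advantage there is a smaller loss rather than a gain.
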
| Run | Name | Period | Ticks | Arms | Treatment | Control | Delta (Treatment - Control) | Model |
|---|---|---|---|---|---|---|---|---|
| 1 | Where It Started Working | Sep 1 - Nov 30, 2025 | 91 | 3 | -3.16% | -8.44% | +5.28pp | Opus 4.6 |
| 2 | Model-Agnostic Confirmation | Sep 1 - Nov 22, 2025 | 83 | 3 | -2.03% | -6.49% | +4.46pp | Sonnet 4.5 |
| 3 | Briefing Restructuring | Sep 1, 2025 - Feb 28, 2026 | 181 | 3 | -10.35% | -15.69% | +5.34pp | Opus 4.6 |
| 4 | Architecture Breakthrough | Sep 1, 2025 - Mar 4, 2026 | 185 | 3 | +0.21% | -14.69% | +14.90pp | Opus 4.6 |
| 5 | Confirmation Run | Sep 1, 2025 - Mar 6, 2026 | 187 | 3 | +8.46% | -6.31% | +14.77pp | Opus 4.6 |
| 6 | First 4-Arm RCT | Sep 1, 2025 - Mar 6, 2026 | 187 | 4 | +3.18% | -9.51% | +12.69pp | Opus 4.6 |
| 7 | Strongest Result | Sep 1, 2025 - Mar 21, 2026 | 202 | 4 | +7.83% | -8.14% | +15.97pp | Opus 4.6 |
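
The Delta column can be reproduced from the table itself: it is the treatment figure minus the control figure, in percentage points. The sketch below re-derives it for each run and also checks the tick counts against the stated periods. It assumes one tick per calendar day with both endpoints included, which is inferred from the numbers in the table and not stated in the source. The names `RUNS`, `delta_pp`, `inclusive_days`, and `check_runs` are illustrative only.

```python
from datetime import date

# Rows copied from the table above:
# (run, period start, period end, ticks, treatment %, control %, reported delta in pp)
RUNS = [
    (1, date(2025, 9, 1), date(2025, 11, 30), 91, -3.16, -8.44, 5.28),
    (2, date(2025, 9, 1), date(2025, 11, 22), 83, -2.03, -6.49, 4.46),
    (3, date(2025, 9, 1), date(2026, 2, 28), 181, -10.35, -15.69, 5.34),
    (4, date(2025, 9, 1), date(2026, 3, 4), 185, 0.21, -14.69, 14.90),
    (5, date(2025, 9, 1), date(2026, 3, 6), 187, 8.46, -6.31, 14.77),
    (6, date(2025, 9, 1), date(2026, 3, 6), 187, 3.18, -9.51, 12.69),
    (7, date(2025, 9, 1), date(2026, 3, 21), 202, 7.83, -8.14, 15.97),
]


def delta_pp(treatment: float, control: float) -> float:
    """Treatment minus control, in percentage points."""
    return treatment - control


def inclusive_days(start: date, end: date) -> int:
    """Calendar days in the period, counting both endpoints.

    Assumption: one tick per calendar day (not stated in the source).
    """
    return (end - start).days + 1


def check_runs(runs) -> list[int]:
    """Return the run numbers whose delta or tick count does not reconcile."""
    mismatches = []
    for run, start, end, ticks, treatment, control, reported in runs:
        # Tolerance of half a reporting unit absorbs float rounding.
        delta_ok = abs(delta_pp(treatment, control) - reported) < 0.005
        ticks_ok = inclusive_days(start, end) == ticks
        if not (delta_ok and ticks_ok):
            mismatches.append(run)
    return mismatches


def mean_delta(runs) -> float:
    """Unweighted mean of the reported deltas across runs."""
    return sum(row[6] for row in runs) / len(runs)
```

Calling `check_runs(RUNS)` returns an empty list, meaning every reported delta and tick count in the table reconciles under these assumptions. `mean_delta(RUNS)` gives a simple unweighted average; because the runs share a start date and overlap heavily in period, that average is not a summary of seven independent experiments.
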