Emotional relief, measured
Models, ranked by the relief they bring
| # | Model | SUDS drop | SRS | Impact score |
|---|---|---|---|---|
| 1 | gpt-4o-mini | -9.3 | 6.1 | 1.56 |
| 2 | gpt-4o-mini-realtime-preview | -4.2 | 6.6 | 0.60 |
| 3 | claude-haiku-4-5-20251001 | -10.0 | 8.5 | 0.32 |
| 4 | gpt-5.4-mini | -4.0 | 7.0 | 0.25 |
| 5 | claude-sonnet-4-5 | -5.0 | 6.7 | 0.16 |
| 6 | gemini-3-flash-preview | 0.0 | 5.0 | 0.00 |
| 7 | gpt-4.1 | +5.0 | - | -0.16 |
| 8 | gpt-4.1-mini | +15.0 | 2.0 | -0.48 |
| 9 | grok-4.20 | +20.0 | 0.5 | -0.65 |
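SUDS (Subjective Units of Distress Scale) and SRS (Session Rating Scale) are widely used measures in therapy research. SUDS is recorded before and after each session, so a negative SUDS drop means distress fell over the session; SRS is recorded at the close of each session. The impact score is a smoothed average, so companions with more sessions are scored with more confidence.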
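
The page does not give the formula behind the impact score, so the following is only a sketch of one common way to build a smoothed average: shrinking each companion's mean SUDS improvement toward a neutral value, with less shrinkage as sessions accumulate. The function name `impact_score`, the `prior_weight` of 30, and the neutral `prior_mean` of 0 are all assumptions made for illustration, not values taken from the leaderboard.

```python
from statistics import mean


def impact_score(suds_changes, prior_weight=30.0, prior_mean=0.0):
    """Shrink the mean SUDS improvement toward prior_mean.

    suds_changes: per-session SUDS change (after minus before), so a
                  negative number means distress fell.
    prior_weight: hypothetical number of "pseudo-sessions" at prior_mean;
                  larger values pull small samples harder toward it.
    """
    n = len(suds_changes)
    if n == 0:
        # No sessions yet: fall back to the neutral prior.
        return prior_mean
    # Flip the sign so that a drop in distress counts as positive impact.
    improvement = -mean(suds_changes)
    return (n * improvement + prior_weight * prior_mean) / (n + prior_weight)


# Hypothetical companions with the same average change of -5.0 per session.
one_session = impact_score([-5.0])
ten_sessions = impact_score([-5.0] * 10)
```

Under these assumptions, the companion with ten sessions scores higher than the one with a single session even though both average the same SUDS change, which is the behaviour the description of the impact score suggests.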