WOLFX Research · whitepapers/overnight-drift.md

Overnight Drift Reversal — Strategy Whitepaper v1

WOLFX Research · 2026-04-25 · Paper canary

Summary

Long SPY at the close, exit at the next session's open. Enter only when the prior 5-day intraday log return is negative. The classical "overnight returns are positive on average, intraday returns are roughly flat" finding becomes a tradeable signal once you condition entry on intraday weakness — the days when intraday holders have been beaten up are exactly the days overnight buyers re-mark the index.

Walk-forward test Sharpe 1.229, MaxDD -8.36 %, PF 1.25, 163 trades. Shipped as V174 scaffold on 2026-04-25 with flag WOLFX_OVERNIGHT_DRIFT_ENABLED=0. Production rollout contingent on 30 sessions of paper-shadow stability.

Academic lineage

The trade is not arbitrage; it is collecting a behavioural premium that intraday traders systematically pay overnight holders. The premium has persisted in published research across 1995-2020 and remains visible in 2024-2026 walk-forward data.

Signal

For each session t:

```
r_intraday(s) = log( Close(s) / Open(s) )                     # intraday log return of session s
filter_t      = sum( r_intraday(t-1), …, r_intraday(t-5) )    # sum over the prior 5 sessions

if filter_t < 0:  ENTER long SPY at MOC(t), EXIT at MOO(t+1)
otherwise:        HOLD CASH
```

The filter passes (filter_t < 0) on roughly 43 % of sessions, so the strategy sits in cash on the remaining ~57 %. Variant A (always-on, no filter) was tested head-to-head against Variant B (filtered): Variant A reached test Sharpe 0.479 and PF 1.09, which fails the PF ≥ 1.2 gate and confirms that the filter does real work.
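The signal definition above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the production V174 code: the function names and the toy price series are hypothetical, and the only logic taken from the text is the 5-session sum of intraday log returns over sessions t-1 to t-5 and the `filter_t < 0` entry rule.

```python
import math
from typing import List, Optional


def intraday_log_returns(opens: List[float], closes: List[float]) -> List[float]:
    """r_intraday(s) = log(Close(s) / Open(s)) for each session s."""
    return [math.log(c / o) for o, c in zip(opens, closes)]


def overnight_signals(opens: List[float], closes: List[float],
                      lookback: int = 5) -> List[Optional[bool]]:
    """Per-session decision at MOC(t).

    True  -> enter long at the close of session t, exit at the open of t+1
    False -> hold cash
    None  -> fewer than `lookback` prior sessions available
    """
    r = intraday_log_returns(opens, closes)
    signals: List[Optional[bool]] = []
    for t in range(len(r)):
        if t < lookback:
            signals.append(None)
            continue
        # Sessions t-lookback .. t-1 only; session t itself is excluded,
        # so the decision never uses the close it is about to trade.
        filter_t = sum(r[t - lookback:t])
        signals.append(filter_t < 0)
    return signals


# Toy data (hypothetical): five weak intraday sessions, then one strong one.
toy_opens = [100.0] * 7
toy_closes = [99.0, 99.0, 99.0, 99.0, 99.0, 110.0, 110.0]
toy_signals = overnight_signals(toy_opens, toy_closes)
print(toy_signals)  # [None, None, None, None, None, True, False]
```

On the toy series the filter passes at session 5 (five negative intraday returns behind it) and fails at session 6, once the strong session enters the lookback window.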

Backtest (Round 14 PASS, Variant B)

Walk-forward with a 70/15/15 train/validation/test split on 2016-01 through 2026-04. Costs: 1 bp slippage + 0.005 % (0.5 bp) round-trip commission = 1.5 bps total per overnight trade.
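As a sketch of the setup just described, the snippet below splits a session count chronologically 70/15/15 and applies the 1.5 bps cost to a single overnight trade. The helper names are hypothetical, and deducting the cost once per trade as a flat amount from the simple return is an assumption about the cost model, not something the backtest description confirms.

```python
def walk_forward_bounds(n_sessions: int, train_pct: int = 70, val_pct: int = 15):
    """Chronological (start, end) index pairs for train / validation / test.

    Integer percentages avoid floating-point rounding at the boundaries.
    """
    train_end = n_sessions * train_pct // 100
    val_end = n_sessions * (train_pct + val_pct) // 100
    return (0, train_end), (train_end, val_end), (val_end, n_sessions)


SLIPPAGE_BPS = 1.0
COMMISSION_PCT = 0.005                   # percent, round-trip
COMMISSION_BPS = COMMISSION_PCT * 100    # 1 % = 100 bps, so 0.005 % = 0.5 bps
TOTAL_COST_BPS = SLIPPAGE_BPS + COMMISSION_BPS


def net_overnight_return(close_t: float, open_next: float,
                         cost_bps: float = TOTAL_COST_BPS) -> float:
    """Simple close-to-open return minus a flat per-trade cost (assumed model)."""
    gross = open_next / close_t - 1.0
    return gross - cost_bps / 10_000.0


bounds = walk_forward_bounds(1000)
print(bounds)
print(f"total cost: {TOTAL_COST_BPS:.1f} bps")
```

A 10 bp overnight gain therefore nets roughly 8.5 bp under this model, which shows how much of a small average edge the cost assumption consumes.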

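| Metric | Test slice (2024-08 → 2026-04) | Gate | Result |
| --- | --- | --- | --- |
| Sharpe | 1.229 | ≥ 0.40 | PASS |
| MaxDD | -8.36 % | ≤ 15 % | PASS |
| Profit factor | 1.25 | ≥ 1.2 | PASS |
| Trades | 163 | ≥ 100 | PASS |
| Full-window Sharpe | 0.447 | > 0 | PASS |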

Train / Validation / Test Sharpe: 0.51 / 1.35 / 1.23. Out-of-sample Sharpe does not decay relative to train, so the walk-forward shows no overfit signature and no regime cliff.
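For readers who want to reproduce the gate check, here is a minimal sketch of the three return-based metrics and the gate thresholds from the results table. The annualisation factor (252 periods per year), the use of sample standard deviation, and the function names are assumptions; the whitepaper does not state how Sharpe was annualised.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def sharpe(returns: List[float], periods_per_year: int = 252) -> float:
    """Annualised Sharpe with zero risk-free rate (annualisation is an assumption)."""
    sd = stdev(returns)
    return 0.0 if sd == 0 else mean(returns) / sd * math.sqrt(periods_per_year)


def max_drawdown(returns: List[float]) -> float:
    """Worst peak-to-trough loss of the compounded equity curve, as a negative number."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def profit_factor(returns: List[float]) -> float:
    """Sum of winning returns divided by the absolute sum of losing returns."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return math.inf if losses == 0 else gains / losses


def gate_report(test_sharpe: float, max_dd: float, pf: float,
                trades: int, full_sharpe: float) -> Dict[str, bool]:
    """Apply the thresholds listed in the results table to one variant."""
    return {
        "sharpe": test_sharpe >= 0.40,
        "max_dd": abs(max_dd) <= 0.15,
        "profit_factor": pf >= 1.2,
        "trades": trades >= 100,
        "full_window_sharpe": full_sharpe > 0,
    }


# Variant B figures as reported in the table above.
variant_b = gate_report(1.229, -0.0836, 1.25, 163, 0.447)
print(variant_b, all(variant_b.values()))
```

Returning one boolean per gate rather than a single pass/fail makes it easy to see which threshold a variant misses. Variant A's PF of 1.09, for instance, would fail only the `profit_factor` entry.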

Honest caveats

  1. Concentration. 100 % NAV in a single ETF overnight is a real exposure. Canary sizes at 20 % until 30 sessions of fill-quality data are collected. SPY is about as liquid and diversified as a single instrument gets, but an overnight gap of up to -10 % (COVID weekend, August 2024 yen-carry unwind) hits at full size.
  2. Full-window Sharpe 0.447 is the weakest passing metric. Train slice carried most of the cost drag (Sharpe 0.51) which pulls the full-window number down. Test slice is genuinely better.
  3. MOC / MOO fill quality is the live execution risk. The backtest assumes that Yahoo's "Open" price equals the official opening auction print. In live trading, Alpaca/IBKR MOC and MOO orders generally fill at the official print, but cross-spread leakage on illiquid sessions could add 0.5-1 bps. The 1.5 bps cost model already tolerates this.
  4. The filter is empirically chosen. Multiple academic papers document the underlying overnight drift, but the specific 5-day-prior-intraday-return-negative filter is not from a single paper — it's the cleaner of two variants we tested. Re-testing across 2026-2027 will tell us whether the filter holds out of sample.
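The concentration caveat reduces to one multiplication, sketched below with the figures from the list above (100 % versus 20 % sizing, a -10 % gap, and the reported test-slice MaxDD of -8.36 %). The function name is hypothetical, and the calculation ignores costs and any intraday recovery.

```python
def nav_impact(position_fraction: float, gap_return: float) -> float:
    """NAV change from an overnight gap for a position sized as a fraction of NAV."""
    return position_fraction * gap_return


GAP = -0.10                 # the -10 % overnight gap scenario from caveat 1
REPORTED_MAX_DD = -0.0836   # test-slice MaxDD from the results table

full_size_hit = nav_impact(1.00, GAP)   # production sizing, 100 % NAV
canary_hit = nav_impact(0.20, GAP)      # canary sizing, 20 % NAV

print(f"full size: {full_size_hit:.2%}")   # full size: -10.00%
print(f"canary:    {canary_hit:.2%}")      # canary:    -2.00%
print("single gap exceeds backtest MaxDD at full size:",
      full_size_hit < REPORTED_MAX_DD)
```

At full size a single such gap would exceed the entire backtested drawdown in one session. At canary size the same gap costs about 2 % of NAV.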

Risks and failure modes

Canary protocol

What this is not

This is not market-making. It is not high-frequency. It is at most one trade per session, placed at 15:55 ET and exited at the next open. Hold duration is about 17.5 hours on a weeknight (16:00 to 09:30 ET) and longer over weekends and holidays. The signal is rooted in a known and persistent behavioural premium, not in microstructure latency advantages.

---

This whitepaper is descriptive, not prescriptive. WOLFX publishes every signal and every realised fill. Past performance, including walk-forward backtest performance, is not predictive of future results. Trading ETFs involves substantial risk of loss.
