Long SPY at the close, exit at the next session's open. Enter only when the prior 5-day intraday log return is negative. The classical finding that overnight returns are positive on average while intraday returns are roughly flat becomes a tradeable signal once entry is conditioned on intraday weakness: the sessions after intraday holders have been beaten up are the sessions on which overnight buyers tend to re-mark the index higher.
Walk-forward test Sharpe 1.229, MaxDD -8.36 %, PF 1.25, 163 trades. Shipped as V174 scaffold on 2026-04-25 with flag WOLFX_OVERNIGHT_DRIFT_ENABLED=0. Production rollout contingent on 30 sessions of paper-shadow stability.
Academic lineage
Lou, Polk, Skouras, A Tug of War: Overnight Versus Intraday Expected Returns, Journal of Financial Economics, 2019: documents the persistent positive overnight drift across US equity markets and the lack of correlation between overnight and intraday returns.
Bogousslavsky, The Cross-Section of Intraday and Overnight Returns, Journal of Financial Economics, 2021: extends the finding to the cross-section and shows that the overnight component does most of the work in long-only equity returns.
Berkman, Koch, Tuttle, Zhang, Paying Attention: Overnight Returns and the Hidden Cost of Buying at the Open, Journal of Financial and Quantitative Analysis, 2012: earlier evidence with a market-microstructure framing.
The trade is not arbitrage; it is collecting a behavioural premium that intraday traders systematically pay overnight holders. The premium has persisted in published research across 1995-2020 and remains visible in 2024-2026 walk-forward data.
```
ENTER long SPY at MOC(t), exit at MOO(t+1)   if filter_t < 0
HOLD CASH                                    otherwise
```
The filter is active on roughly 43 % of sessions. Variant A (always-on, no filter) was tested head-to-head against Variant B (filtered): Variant A scored test Sharpe 0.479 and PF 1.09. That PF is below the 1.2 gate, so Variant A fails, which indicates the filter does real work.
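The rule above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the V174 code: `Bar`, `filter_values`, and `overnight_trades` are hypothetical names, and the filter window is assumed to be the five sessions before t (t-5 through t-1), since the source does not say whether session t itself is included.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Bar(NamedTuple):
    """One daily session: official open and close prices."""
    open: float
    close: float


def filter_values(bars: List[Bar], lookback: int = 5) -> List[Optional[float]]:
    """filter_t = sum of intraday log returns over sessions t-lookback .. t-1.

    Assumption: session t is excluded, so the value is known before the close.
    """
    intraday = [math.log(b.close / b.open) for b in bars]
    out: List[Optional[float]] = []
    for t in range(len(bars)):
        out.append(sum(intraday[t - lookback:t]) if t >= lookback else None)
    return out


def overnight_trades(bars: List[Bar], lookback: int = 5) -> List[Tuple[int, float]]:
    """Return (t, overnight log return) for every session where filter_t < 0.

    The overnight return is log(open[t+1] / close[t]), i.e. MOC(t) to MOO(t+1).
    """
    f = filter_values(bars, lookback)
    trades = []
    for t in range(len(bars) - 1):  # the last bar has no next open
        if f[t] is not None and f[t] < 0:
            trades.append((t, math.log(bars[t + 1].open / bars[t].close)))
    return trades
```

Calling `overnight_trades(bars)` on a list of daily bars yields gross overnight log returns; costs are applied separately.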
Backtest (Round 14 PASS, Variant B)
Walk-forward 70/15/15 on 2016-01 through 2026-04. Costs: 1 bp slippage + 0.005 % (0.5 bp) commission round-trip = 1.5 bps total per overnight trade.
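A minimal sketch of the split and cost arithmetic stated above, assuming a simple chronological split by session count. The function names are hypothetical and the real pipeline may cut the slices by calendar date instead.

```python
def walk_forward_split(n_sessions: int, pct=(70, 15, 15)):
    """Chronological train / validation / test index ranges (no shuffling)."""
    train_end = n_sessions * pct[0] // 100
    val_end = train_end + n_sessions * pct[1] // 100
    return (range(0, train_end),
            range(train_end, val_end),
            range(val_end, n_sessions))


def round_trip_cost_bps(slippage_bps: float = 1.0,
                        commission_pct: float = 0.005) -> float:
    """Total cost per overnight trade in basis points (1 % = 100 bps)."""
    return slippage_bps + commission_pct * 100.0


def net_return(gross_return: float, cost_bps: float) -> float:
    """Subtract the round-trip cost from a gross simple return."""
    return gross_return - cost_bps / 10_000.0
```

With the defaults, `round_trip_cost_bps()` gives 1.5 bps, so a gross overnight return of 0.10 % nets to 0.085 %.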
| Metric | Test slice (2024-08 → 2026-04) | Gate | Result |
| --- | --- | --- | --- |
| Sharpe | 1.229 | ≥ 0.40 | PASS |
| MaxDD | -8.36 % | ≤ 15 % | PASS |
| Profit factor | 1.25 | ≥ 1.2 | PASS |
| Trades | 163 | ≥ 100 | PASS |
| Full-window Sharpe | 0.447 | > 0 | PASS |
Train / Validation / Test Sharpe: 0.51 / 1.35 / 1.23 — clean walk-forward, no overfit signature, no regime cliff.
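For readers who want to reproduce the gate check, here is one way the metrics in the table could be computed. It is a sketch under assumptions the source does not confirm: Sharpe is annualised with 252 periods on a daily return series, drawdown is measured on a compounded equity curve, and profit factor is gross gains over gross losses. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def sharpe(returns, periods_per_year: int = 252) -> float:
    """Annualised Sharpe ratio of a per-period return series (zero risk-free rate)."""
    sd = stdev(returns)
    return mean(returns) / sd * math.sqrt(periods_per_year) if sd > 0 else 0.0


def max_drawdown(returns) -> float:
    """Worst peak-to-trough decline of the compounded equity curve (negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def profit_factor(trade_returns) -> float:
    """Sum of winning trade returns divided by the absolute sum of losing ones."""
    gains = sum(r for r in trade_returns if r > 0)
    losses = -sum(r for r in trade_returns if r < 0)
    return gains / losses if losses > 0 else math.inf


def passes_gates(test_sharpe, max_dd, pf, n_trades, full_sharpe) -> bool:
    """Apply the five gates from the table; max_dd is a fraction such as -0.0836."""
    return (test_sharpe >= 0.40 and abs(max_dd) <= 0.15 and pf >= 1.2
            and n_trades >= 100 and full_sharpe > 0)
```

Feeding in the reported Variant B figures, `passes_gates(1.229, -0.0836, 1.25, 163, 0.447)` evaluates to `True`.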
Honest caveats
Concentration. 100 % NAV in a single ETF overnight is a real exposure. The canary sizes at 20 % until 30 sessions of fill-quality data are collected. SPY is about as liquid and diversified as a single-ticker position gets, but a severe overnight gap on the order of -10 % (the scenario to plan for after events like the COVID weekends or the August 2024 yen-carry unwind) hits at full size.
Full-window Sharpe 0.447 is the weakest passing metric. The train slice carried most of the cost drag (Sharpe 0.51), which pulls the full-window number down; the test slice is genuinely better.
MOC / MOO fill quality is the live execution risk. The backtest assumes Yahoo's "Open" price is the official opening auction print. In live trading, Alpaca/IBKR MOC orders generally fill at the official closing print, but cross-spread leakage on illiquid sessions could add 0.5-1 bps. The 1.5 bps cost model already tolerates this.
The filter is empirically chosen. Multiple academic papers document the underlying overnight drift, but the specific 5-day-prior-intraday-return-negative filter is not from a single paper — it's the cleaner of two variants we tested. Re-testing across 2026-2027 will tell us whether the filter holds out of sample.
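The concentration caveat comes down to simple arithmetic: the NAV impact of an overnight gap scales linearly with position size. A small sketch, with a hypothetical helper name and ignoring costs:

```python
def nav_impact_pct(position_pct_nav: float, gap_pct: float) -> float:
    """NAV change (in %) from an overnight gap, for a position sized as % of NAV."""
    return position_pct_nav / 100.0 * gap_pct


# Canary size versus full size under the -10 % gap scenario from the caveat.
canary_hit = nav_impact_pct(20, -10)   # about -2 % of NAV
full_hit = nav_impact_pct(100, -10)    # about -10 % of NAV
```

At the 20 % canary size the same gap costs roughly a fifth of what it costs at full size, which is the reason for the staged rollout.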
Risks and failure modes
Regime shift in overnight liquidity. If the overnight market structure changes — e.g. extended-hours equity trading becomes deeply liquid and the "overnight gap" pricing inefficiency closes — the premium evaporates. Monitor: realised overnight return rolling 30-day average. If it drops below 0, the canary auto-disables.
Market shock at the close. A Fed surprise at 14:00 ET that triggers a 16:00 ET cascade could leave SPY opening the next morning at -3 %. The MaxDD of -8.36 % observed empirically across 2024-2026 includes such days; it is not theoretical.
Filter inversion. If the prior-5-day-intraday-negative condition starts predicting negative overnight returns instead of positive, that's the same kind of regime inversion that killed PEAD (Round < 7). The 30-day rolling Sharpe monitor is the canary's circuit breaker.
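The regime-shift monitor described in the first item (rolling 30-day average of realised overnight returns, auto-disable below 0) could look roughly like the following. The class name is hypothetical, and waiting for a full window before the check can fire is an assumption, not something the source specifies.

```python
from collections import deque
from statistics import mean


class OvernightDriftMonitor:
    """Tracks the rolling mean of realised overnight returns."""

    def __init__(self, window: int = 30):
        # deque with maxlen drops the oldest observation automatically
        self.returns = deque(maxlen=window)

    def update(self, overnight_return: float) -> None:
        """Record one realised overnight return."""
        self.returns.append(overnight_return)

    def should_disable(self) -> bool:
        """True once the window is full and its average is below zero."""
        window_full = len(self.returns) == self.returns.maxlen
        return window_full and mean(self.returns) < 0
```

A scheduler would call `update` after each realised overnight return and stop trading as soon as `should_disable()` returns `True`.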
Canary protocol
Flag WOLFX_OVERNIGHT_DRIFT_ENABLED=0 keeps execution dormant. V174 scheduler runs the filter compute + signal log every weekday at 15:55 ET, regardless of flag.
Canary sizing: 20 % NAV for first 30 sessions.
Promotion gate: rolling 30-session Sharpe ≥ 0.5 with realised slippage < 3 bps. After that, scale toward 100 % NAV in steps of 20 percentage points per fortnight.
Hard stop: -3 % NAV in any 2-week rolling window flattens the position and disables the flag for 30 days.
Daily slippage tracking against the 1.5 bps cost model — if 30-day rolling slippage exceeds 3 bps, sizing reverts to 20 % regardless of return profile.
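The sizing rules above can be read as a small decision function evaluated at each fortnightly review. This is a hedged sketch: `review_size` and its parameters are hypothetical, the order in which the rules are checked is an assumption, and the 30-day flag lockout after a hard stop is left to the caller.

```python
from typing import Tuple


def review_size(current_pct: int, sessions_live: int, rolling_sharpe: float,
                slippage_bps: float, pnl_2w_pct: float) -> Tuple[int, bool]:
    """Return (new position size in % NAV, flag_enabled) for the next fortnight."""
    if pnl_2w_pct <= -3.0:
        return 0, False                    # hard stop: flatten and disable the flag
    if slippage_bps > 3.0:
        return 20, True                    # slippage breach: revert to canary size
    if sessions_live < 30:
        return 20, True                    # still inside the 30-session canary phase
    if rolling_sharpe >= 0.5 and slippage_bps < 3.0:
        return min(current_pct + 20, 100), True   # promotion: +20 pp, capped at 100
    return current_pct, True               # gate not met: hold the current size
```

For example, a strategy at 20 % NAV with 30 live sessions, a rolling Sharpe of 0.6, and 1.2 bps of slippage would step up to 40 % at the next review.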
What this is not
This is not market-making. It is not high-frequency. It is at most one trade per session, placed at 15:55 ET and exited at the next open. Hold duration: about 17.5 hours from a 16:00 ET close to a 09:30 ET open on weeknights, longer over weekends and holidays. The signal is rooted in a known and persistent behavioural premium, not in microstructure latency advantages.
---
This whitepaper is descriptive, not prescriptive. WOLFX publishes every signal and every realised fill. Past performance, including walk-forward backtest performance, is not predictive of future results. Trading ETFs involves substantial risk of loss.