Walk-forward backtesting¶
Walk-forward backtesting splits the time series into successive train / test windows, runs the strategy on each pair, and returns per-fold stats. It is useful as a sanity check on overfitting: if the strategy looks great on the train window but consistently degrades on the held-out test window across folds, the backtest is likely overfit.
flox_py.WalkForwardRunner ships in Python. The same primitive is
available as flox.WalkForwardRunner for Node and through the C ABI
for Codon (flox_walk_forward_run_csv).
Modes¶
anchored: the train window starts at bar 0 and grows. Each fold
trains on [0, t), tests on [t, t + test_size), then t advances by
step. Set min_train_size to skip the first folds where the train
window is too small.
sliding: the train window is fixed-size and slides forward. Each
fold trains on [t, t + train_size), tests on
[t + train_size, t + train_size + test_size), then t advances by step.
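The window arithmetic for both modes can be sketched in plain Python. This is an illustration only, not the runner's implementation: fold_bounds and Fold are hypothetical names, ranges are treated as half-open, the anchored mode is assumed to start at t = min_train_size, and folds whose test window would run past the end of the data are assumed to be dropped.

```python
from typing import Iterator, NamedTuple


class Fold(NamedTuple):
    train_start: int
    train_end: int   # exclusive
    test_start: int
    test_end: int    # exclusive


def fold_bounds(n_bars: int, mode: str, test_size: int, step: int,
                train_size: int = 0, min_train_size: int = 0) -> Iterator[Fold]:
    """Yield half-open [start, end) bar ranges for each fold (hypothetical helper)."""
    if test_size <= 0 or step <= 0:
        raise ValueError("test_size and step must be positive")
    if mode == "anchored":
        # Train window always starts at bar 0 and grows by `step` per fold.
        # Assumption: the first fold trains on exactly min_train_size bars.
        t = max(min_train_size, 1)
        while t + test_size <= n_bars:
            yield Fold(0, t, t, t + test_size)
            t += step
    elif mode == "sliding":
        if train_size <= 0:
            raise ValueError("sliding mode needs a positive train_size")
        # Train window keeps a fixed size and moves forward by `step`.
        t = 0
        while t + train_size + test_size <= n_bars:
            split = t + train_size
            yield Fold(t, split, split, split + test_size)
            t += step
    else:
        raise ValueError(f"unknown mode: {mode!r}")


anchored = list(fold_bounds(400, "anchored", test_size=100, step=100,
                            min_train_size=100))
sliding = list(fold_bounds(400, "sliding", test_size=100, step=100,
                           train_size=200))
```

With 400 bars, the anchored call produces three folds whose train windows all start at bar 0, while the sliding call produces two folds whose train windows both span 200 bars.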
Python¶
import flox_py as flox

reg = flox.SymbolRegistry()
btc = reg.add_symbol("exchange", "BTCUSDT", 0.01)

class SmaCross(flox.Strategy):
    def __init__(self, syms):
        super().__init__(syms)
        self.fast = flox.SMA(10)
        self.slow = flox.SMA(30)

    def on_trade(self, ctx, t):
        f = self.fast.update(t.price)
        s = self.slow.update(t.price)
        if f is None or s is None or not self.slow.ready:
            return
        if f > s and ctx.is_flat():
            self.market_buy(0.01)
        elif f < s and ctx.is_flat():
            self.market_sell(0.01)

wfr = flox.WalkForwardRunner(
    reg, fee_rate=0.0004, initial_capital=10_000,
    mode="anchored", test_size=100, step=100, min_train_size=100,
)

# Factory called twice per fold (train, then test). Build a fresh
# strategy every time: state from a prior fold must not leak.
wfr.set_strategy_factory(lambda fold_index: SmaCross([btc]))

folds = wfr.run_csv("data/btcusdt_sample.csv", "BTCUSDT")
for f in folds:
    print(f"fold {f['fold_index']}: "
          f"train return={f['train_stats']['return_pct']:+.4f}% "
          f"sharpe={f['train_stats']['sharpe']:+.4f} | "
          f"test return={f['test_stats']['return_pct']:+.4f}% "
          f"sharpe={f['test_stats']['sharpe']:+.4f}")
The factory pattern is non-negotiable: the engine calls it once per window with no shortcut for "reuse my strategy". A leaked indicator buffer or position counter from a prior fold would silently corrupt the next fold's stats.
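To make the leak concrete, here is a small standalone illustration that does not use flox at all. RollingMean is a hypothetical stand-in for an indicator such as an SMA, and the prices are made up. Reusing one instance across two windows lets train-window prices bleed into the first test-window values, while a fresh instance does not.

```python
from collections import deque


class RollingMean:
    """Tiny stand-in for an indicator (hypothetical, not the flox SMA class)."""

    def __init__(self, period):
        self.buf = deque(maxlen=period)

    def update(self, x):
        self.buf.append(x)
        return sum(self.buf) / len(self.buf)


def run_window(indicator, prices):
    """Feed one window of prices through the indicator and collect its outputs."""
    return [indicator.update(p) for p in prices]


train_prices = [100.0, 100.0, 100.0]   # made-up data
test_prices = [10.0, 10.0, 10.0]       # made-up data

# Reused instance: the buffer still holds train prices when the test window starts.
shared = RollingMean(3)
run_window(shared, train_prices)
leaked = run_window(shared, test_prices)

# Fresh instance per window, which is what a factory gives you.
fresh = run_window(RollingMean(3), test_prices)
```

Here leaked starts at 70.0 and 40.0 before settling at 10.0, whereas fresh is 10.0 throughout. Any signal computed from the first two leaked values would be based on data from outside the test window.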
Node¶
const flox = require('flox-node');

const reg = new flox.SymbolRegistry();
const btc = reg.addSymbol('exchange', 'BTCUSDT', 0.01);

const wfr = new flox.WalkForwardRunner(reg, 0.0004, 10000, {
  mode: 'anchored', testSize: 100, step: 100, minTrainSize: 100,
});

wfr.setStrategyFactory((foldIndex) => {
  const fast = new flox.SMA(10);
  const slow = new flox.SMA(30);
  return {
    symbols: [Number(btc)],
    onTrade(ctx, t, emit) {
      const f = fast.update(t.price);
      const s = slow.update(t.price);
      if (f === null || s === null || !slow.ready) return;
      if (f > s && ctx.position === 0) emit.marketBuy(0.01);
      else if (f < s && ctx.position === 0) emit.marketSell(0.01);
    },
  };
});

const folds = wfr.runCsv('data/btcusdt_sample.csv', 'BTCUSDT');
folds.forEach(f => console.log(f.foldIndex, f.testStats.returnPct));
What you get back per fold¶
{
    "fold_index": 0,
    "train_start_bar": 0, "train_end_bar": 100,
    "test_start_bar": 100, "test_end_bar": 200,
    "train_start_ns": ..., "train_end_ns": ...,
    "test_start_ns": ..., "test_end_ns": ...,
    "train_stats": { ... full BacktestStats ... },
    "test_stats": { ... full BacktestStats ... },
}
The two *_stats blocks have the same shape as the stats that
BacktestRunner.run_csv returns. Compute aggregate statistics (mean /
median / variance over folds) on the client side. The runner
deliberately does not aggregate for you, since useful aggregates depend
on what you are looking for (robustness vs. average performance vs.
worst case).
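A minimal client-side aggregation might look like the sketch below. summarise_folds is a hypothetical helper, and the sample folds are made-up numbers that only mimic the per-fold shape shown above, reduced to the return_pct field used in the Python example.

```python
import statistics


def summarise_folds(folds, metric="return_pct"):
    """Aggregate one metric across folds (hypothetical helper, not part of flox)."""
    test = [f["test_stats"][metric] for f in folds]
    train = [f["train_stats"][metric] for f in folds]
    gaps = [a - b for a, b in zip(train, test)]
    return {
        "mean_test": statistics.fmean(test),        # average performance
        "median_test": statistics.median(test),     # robust to one outlier fold
        "stdev_test": statistics.stdev(test) if len(test) > 1 else 0.0,
        "worst_test": min(test),                    # worst case
        "mean_train_minus_test": statistics.fmean(gaps),  # overfitting signal
    }


# Made-up per-fold results, for illustration only.
sample_folds = [
    {"train_stats": {"return_pct": 2.0}, "test_stats": {"return_pct": 1.0}},
    {"train_stats": {"return_pct": 3.0}, "test_stats": {"return_pct": -1.0}},
    {"train_stats": {"return_pct": 1.0}, "test_stats": {"return_pct": 0.0}},
]
summary = summarise_folds(sample_folds)
```

A large positive mean_train_minus_test alongside a weak mean_test is the pattern the overfitting check is meant to surface. Which of these numbers matters most depends on the question being asked.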
What walk-forward does not do¶
Walk-forward does not optimise hyperparameters per fold. If you need that, run a grid search on each fold's train slice yourself, pick the best parameters, then evaluate them on the test slice. That pattern is the standard "walk-forward optimisation", but it is opinionated enough that the runner stays out of it; compose the primitives instead.
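The shape of that composition can be sketched independently of any backtest engine. Everything named here is hypothetical: walk_forward_optimise is not a flox function, score stands in for "run a backtest on this bar range with these parameters and return one number", and toy_score is a made-up scoring rule used only so the sketch runs on its own.

```python
from itertools import product


def walk_forward_optimise(folds, grid, score):
    """Per fold: grid-search on the train range, then evaluate the winner on test.

    folds: iterable of (train_range, test_range) pairs of half-open bar ranges
    grid:  dict mapping parameter name -> list of candidate values
    score: callable(params, bar_range) -> float, higher is better (hypothetical)
    """
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in product(*grid.values())]
    results = []
    for train_range, test_range in folds:
        # Parameters are selected on the train range only.
        best = max(candidates, key=lambda p: score(p, train_range))
        results.append({
            "params": best,
            "train_score": score(best, train_range),
            "test_score": score(best, test_range),  # out-of-sample evaluation
        })
    return results


def toy_score(params, bar_range):
    """Made-up score: the 'ideal' fast period drifts upward with the start bar."""
    start, _end = bar_range
    target = 5 + start // 100
    return -abs(params["fast"] - target)


folds = [((0, 100), (100, 200)), ((0, 200), (200, 300))]
results = walk_forward_optimise(folds, {"fast": [5, 6, 10]}, toy_score)
```

In a real setup, score would wrap a backtest over the given bar range and return whichever statistic you optimise for. The key property is that the test range is never seen during parameter selection.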