Log a backtest to MLflow

flox_py.mlflow writes FLOX backtest output into MLflow runs so they show up in the MLflow UI. mlflow is an optional dependency:

pip install mlflow

Python-only for now. Node / Codon backtests can produce the same JSON stats; if you want them in MLflow, route them through any MLflow client (the REST API or a language SDK).
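For example, a minimal sketch that replays such a stats file through the plain Python MLflow client (the stats.json name and its flat key layout are assumptions, not part of flox_py):

import json
import math
import mlflow

with open("stats.json") as f:  # e.g. written by a Node backtest
    stats = json.load(f)

mlflow.set_experiment("my-strategy")
with mlflow.start_run(run_name="node-backtest"):
    for key, value in stats.items():
        if isinstance(value, (int, float)) and math.isfinite(value):
            mlflow.log_metric(key, value)    # finite numbers -> metrics
        else:
            mlflow.set_tag(key, str(value))  # everything else -> tags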

One-shot logging

Run a backtest, then log it.

import flox_py as flox
from flox_py import mlflow as flox_mlflow

registry = flox.SymbolRegistry()
btc = registry.add_symbol("exchange", "BTCUSDT", tick_size=0.01)

bt = flox.BacktestRunner(registry, fee_rate=0.0004, initial_capital=10_000)
bt.set_strategy(MyStrategy([btc]))  # MyStrategy: your strategy class, not shown here
stats = bt.run_csv("btcusdt_1m.csv", symbol="BTCUSDT")

run_id = flox_mlflow.log_backtest(
    stats=stats,
    equity_curve=bt.equity_curve(),
    trades=bt.trades(),
    params={"fast": 10, "slow": 30, "fee": 0.0004},
    run_name="sma-crossover-2025-01",
    experiment="my-strategy",
)
print(f"logged as run {run_id}")

What ends up where:

| Source | MLflow field |
| --- | --- |
| Numeric stats keys (return_pct, sharpe, max_drawdown_pct, total_trades, win_rate, …) | metrics |
| Non-numeric stats keys (timestamps, names) | tags |
| params | params |
| equity_curve | equity_curve.csv artifact, plus equity_curve.png when matplotlib is installed |
| trades | trades.csv artifact |
| html_report (path to a file) | the file, attached as an artifact |
| tags | run tags |
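For instance, attaching an HTML report (report.html is a stand-in for whatever your pipeline writes):

flox_mlflow.log_backtest(
    stats=stats,
    html_report="report.html",  # stand-in path; any existing file works
    experiment="my-strategy",
)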

NaN and Inf are not valid MLflow metric values. The integration records them as tag strings instead, so the value is still visible in the UI.
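A minimal sketch of what that means in practice (the math.nan here is deliberate):

import math

stats = {"return_pct": 12.5, "sharpe": math.nan}
flox_mlflow.log_backtest(stats=stats, experiment="my-strategy")
# return_pct lands as a metric; sharpe is recorded as a tag string instead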

Context-manager flow

log_to_mlflow opens a run, yields a logger, and closes the run on block exit. Use it when you want extra params, tags, or artifacts in the same run.

with flox_mlflow.log_to_mlflow(
    run_name="grid-cell-fast=10-slow=30",
    experiment="my-strategy",
) as run:
    run.log_params({"fast": 10, "slow": 30})
    run.log_tags({"data": "btcusdt_1m_2024Q4"})
    stats = bt.run_csv("btcusdt_1m.csv", symbol="BTCUSDT")
    run.log_backtest(
        stats=stats,
        equity_curve=bt.equity_curve(),
        trades=bt.trades(),
    )
    run.log_artifact("debug_plot.png")

The yielded run.run_id is the MLflow run ID — pass it on if you need to nest other work under this run.
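For example, a sketch that reopens the run later with the plain MLflow client (saved_run_id is an assumption about how you stored it):

import mlflow

# saved_run_id: the run.run_id captured from the context manager above
with mlflow.start_run(run_id=saved_run_id):
    mlflow.log_metric("post_hoc_check", 1.0)  # lands in the same MLflow run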

Pointing at a tracking server

The default tracking URI is whatever mlflow.get_tracking_uri() already returns, which honours the MLFLOW_TRACKING_URI env var and any prior mlflow.set_tracking_uri(...) call.
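Setting it globally up front also works, and later flox_mlflow calls inherit it:

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")  # all later calls use this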

Inline override:

flox_mlflow.log_backtest(
    stats=stats,
    tracking_uri="http://localhost:5000",
    experiment="my-strategy",
)

Local file backend (no server):

mlflow ui --backend-store-uri ./mlruns

Then either pass tracking_uri="file:./mlruns" to the call or export MLFLOW_TRACKING_URI=file:./mlruns before running the script. The UI at http://localhost:5000 then picks up the runs.
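The env-var route in full (backtest.py stands in for your own script):

export MLFLOW_TRACKING_URI=file:./mlruns
python backtest.py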

Grid sweeps

GridSearch returns one stats dict per parameter cell. Wrap the sweep in a parent run and each cell in a nested child run:

import flox_py as flox
from flox_py import mlflow as flox_mlflow

with flox_mlflow.log_to_mlflow(run_name="grid-2025-01",
                               experiment="my-strategy") as parent:
    parent.log_params({"data": "btcusdt_1m_2024Q4"})

    results = grid.run()  # grid: a configured GridSearch; returns a list of {params, stats}
    for cell in results:
        with flox_mlflow.log_to_mlflow(
            run_name=f"cell-{cell['params']}",
            experiment="my-strategy",
            nested=True,
        ) as child:
            child.log_backtest(
                stats=cell["stats"],
                params=cell["params"],
            )

The UI groups the children under the parent, so a 64-cell grid is one row with a chevron instead of 64 separate rows.
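To compare the cells afterwards, the plain MLflow client can pull the children back out by parent run ID. A sketch, assuming parent_run_id was captured from parent.run_id during the sweep and that each cell logged fast, slow, and sharpe:

import mlflow

runs = mlflow.search_runs(
    experiment_names=["my-strategy"],
    filter_string=f"tags.\"mlflow.parentRunId\" = '{parent_run_id}'",
    order_by=["metrics.sharpe DESC"],
)
print(runs[["run_id", "params.fast", "params.slow", "metrics.sharpe"]])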