Iterate the order book from a tape
flox_py.orderbook reconstructs the bid / ask ladder from a .floxlog tape's book event stream. Two surfaces sit on top of the same replay path:
- OrderBookIterator yields a BookSnapshot per bucket window, carrying the latest ladder state observed inside the window.
- book_at(tape, ts_ns, levels) is a point query that walks the tape up to ts_ns and returns the latest ladder state at or before it.
Both apply the standard floxlog book semantics: a snapshot event replaces the ladder, a delta event adds / changes / removes (qty=0) levels.
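The snapshot / delta rules above can be sketched in a few lines of plain Python. This is an illustration only, not the flox_py implementation: the `apply_event` helper and the `(price, qty)` tuple layout are assumptions made for the example.

```python
from typing import Dict, Iterable, Tuple

Level = Tuple[float, float]  # (price, qty); hypothetical plain-tuple layout


def apply_event(ladder: Dict[float, float], levels: Iterable[Level],
                is_snapshot: bool) -> None:
    """Mutate one side of a ladder in place (illustrative helper)."""
    if is_snapshot:
        ladder.clear()  # a snapshot replaces the whole side
    for price, qty in levels:
        if qty == 0:
            ladder.pop(price, None)  # qty=0 removes the level
        else:
            ladder[price] = qty  # add a new level or change an existing one


bids: Dict[float, float] = {}
apply_event(bids, [(100.0, 1.0), (99.0, 2.0)], is_snapshot=True)
apply_event(bids, [(100.0, 0.5)], is_snapshot=False)  # change
apply_event(bids, [(99.0, 0.0)], is_snapshot=False)   # remove
apply_event(bids, [(99.5, 1.2)], is_snapshot=False)   # add
top = sorted(bids.items(), reverse=True)  # best bid first
print(top)  # [(100.0, 0.5), (99.5, 1.2)]
```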
When to reach for this
- Bucket-bar book-aware backtests: at the close of each bar, read the current ladder to compute imbalance, spread, depth-weighted mid, top-of-book microstructure features.
- Vacuum detection: scan ladders for thin levels on either side as a leading signal.
- Execution-side what-if: take a candidate trade timestamp, fetch the book at that instant, walk a VWAP / market-impact slice through the actual depth.
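As a rough illustration of the features listed above, the sketch below computes spread, depth imbalance, a size-weighted mid, and a buy-side VWAP walk from a ladder. It assumes each side is a list of `(price, qty)` tuples with the best level first; the actual level type in BookSnapshot may differ, and the function names are hypothetical. The weighted mid shown is the top-of-book size-weighted variant, one of several common definitions.

```python
from typing import List, Tuple

Level = Tuple[float, float]  # (price, qty), best level first; assumed layout


def spread(bids: List[Level], asks: List[Level]) -> float:
    return asks[0][0] - bids[0][0]


def imbalance(bids: List[Level], asks: List[Level]) -> float:
    """Signed depth imbalance in [-1, 1]; positive means bid-heavy."""
    b = sum(q for _, q in bids)
    a = sum(q for _, q in asks)
    return (b - a) / (b + a)


def weighted_mid(bids: List[Level], asks: List[Level]) -> float:
    """Top-of-book mid, each price weighted by the opposite side's size."""
    (bp, bq), (ap, aq) = bids[0], asks[0]
    return (bp * aq + ap * bq) / (bq + aq)


def vwap_buy(asks: List[Level], qty: float) -> float:
    """Average fill price for a market buy of `qty` walking the ask side."""
    remaining, cost = qty, 0.0
    for price, avail in asks:
        take = min(remaining, avail)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("not enough visible depth")
    return cost / qty


bids = [(100.0, 0.5), (99.0, 2.0)]
asks = [(101.0, 1.5), (102.0, 2.5)]
print(spread(bids, asks))        # 1.0
print(weighted_mid(bids, asks))  # 100.25
print(vwap_buy(asks, 2.0))       # 101.25
```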
OrderBookIterator and book_at are pure-Python wrappers over DataReader.read_book_updates. For the hot tight loop (millions of book events at sub-millisecond resolution per bar) write a Strategy with on_book_snapshot / on_book_delta callbacks and run it through the engine; the wrappers here are for offline reconstruction outside the engine, where readability beats per-event throughput.
Example
The script below builds a tiny synthetic tape and iterates it at a 60-second bucket cadence, then point-queries book_at at a chosen instant:
"""OrderBookIterator round-trip: write a synthetic tape with a few
book events, iterate it at a 60s bucket cadence, then point-query
`book_at` at a chosen offset to confirm the reconstructed ladder.

CI-runnable companion to
[Iterate the order book from a tape](../how-to/iterate-orderbook.md).

Usage:
    cd /path/to/flox
    PYTHONPATH=build/python python3 docs/examples/python_orderbook_iterator.py
"""

from __future__ import annotations

import shutil
import tempfile
from pathlib import Path

import numpy as np

import flox_py
from flox_py import orderbook as ob

_LEVEL_DTYPE = np.dtype([
    ("price_raw", np.int64), ("qty_raw", np.int64), ("side", np.uint8),
])


def _arr(items):
    # Convert (price, qty) floats to the raw fixed-point level records.
    return np.array(
        [(int(round(p * 1e8)), int(round(q * 1e8)), 0) for p, q in items],
        dtype=_LEVEL_DTYPE,
    )


def main() -> None:
    tmp = Path(tempfile.mkdtemp(prefix="flox-ob-iter-"))
    try:
        tape = tmp / "tape"
        tape.mkdir()
        w = flox_py.DataWriter(str(tape), max_segment_mb=4,
                               exchange_id=0, compression="none")
        base = 1_700_000_000_000_000_000
        events = [
            # ts_offset_s, is_snapshot, bids, asks
            (0, True, [(100.0, 1.0), (99.0, 2.0)], [(101.0, 1.5), (102.0, 2.5)]),
            (30, False, [(100.0, 0.5)], []),
            (90, False, [(99.0, 0.0)], [(101.0, 0.0)]),
            (150, False, [(99.5, 1.2)], [(101.5, 0.4)]),
        ]
        for i, (off_s, is_snap, bids, asks) in enumerate(events):
            ts = base + off_s * 1_000_000_000
            w.write_book(exchange_ts_ns=ts, recv_ts_ns=ts,
                         seq=i, symbol_id=1, is_snapshot=is_snap,
                         bids=_arr(bids), asks=_arr(asks))
        w.close()

        # Iterate at 60s buckets, top-5 per side.
        print("OrderBookIterator(bucket=60s, levels=5):")
        for snap in ob.OrderBookIterator(tape, bucket_ns=60_000_000_000,
                                         levels=5):
            print(f"  ts={snap.ts_ns} bids[0]={snap.bids[0] if snap.bids else None} "
                  f"asks[0]={snap.asks[0] if snap.asks else None}")

        # Point query at a chosen instant.
        target = base + 75 * 1_000_000_000
        at = ob.book_at(tape, ts_ns=target, levels=5)
        assert at is not None
        print(f"book_at(ts=base+75s): bids[0]={at.bids[0]} "
              f"asks[0]={at.asks[0]}")
    finally:
        shutil.rmtree(tmp, ignore_errors=True)


if __name__ == "__main__":
    main()
Iterator semantics
OrderBookIterator(tape_path, bucket_ns, levels, t_from=None, t_to=None, symbol_id=None):
- Snapshots are keyed on the floor of each event timestamp onto the bucket_ns grid. Consecutive snapshots therefore advance by bucket_ns, or by a larger multiple of it when empty buckets are skipped.
- The snapshot emitted for bucket B captures the ladder state right before the first event of bucket B + bucket_ns. The state is exclusive of any event in the next bucket; the snapshot for the final bucket reflects every event seen.
- Buckets with no book events are skipped.
- When symbol_id is not set and the tape carries multiple symbols, the iterator yields one snapshot per (bucket, symbol).
- BookSnapshot.crossed is True when the best bid price meets or exceeds the best ask. This is typically a momentary artifact of out-of-order book events on captures without the sorted flag; the caller can choose to drop the snapshot or proceed.
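The bucketing rules above can be sketched in plain Python for a single symbol and a single side. This is an illustrative model, not the library code: the `iter_buckets` name, the event tuple layout, and the choice to label each snapshot with the bucket's start timestamp are all assumptions.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Level = Tuple[float, float]           # (price, qty); assumed layout
Event = Tuple[int, bool, List[Level]]  # (ts_ns, is_snapshot, levels)


def iter_buckets(events: Iterable[Event],
                 bucket_ns: int) -> Iterator[Tuple[int, Dict[float, float]]]:
    """Yield (bucket_start_ns, ladder copy) per non-empty bucket."""
    ladder: Dict[float, float] = {}
    current = None
    for ts_ns, is_snapshot, levels in events:
        bucket = ts_ns - ts_ns % bucket_ns  # floor onto the bucket grid
        if current is not None and bucket != current:
            # Emit the previous bucket before applying the new bucket's event.
            yield current, dict(ladder)
        current = bucket
        if is_snapshot:
            ladder.clear()
        for price, qty in levels:
            if qty == 0:
                ladder.pop(price, None)
            else:
                ladder[price] = qty
    if current is not None:
        yield current, dict(ladder)  # final bucket reflects every event seen


S = 1_000_000_000
events = [
    (0 * S, True, [(100.0, 1.0), (99.0, 2.0)]),
    (30 * S, False, [(100.0, 0.5)]),
    (90 * S, False, [(99.0, 0.0)]),
    (150 * S, False, [(99.5, 1.2)]),
]
out = list(iter_buckets(events, 60 * S))
for start, ladder in out:
    print(start // S, sorted(ladder.items(), reverse=True))
```

Buckets that see no event never become `current`, so they produce no snapshot, which mirrors the skip rule in the list above.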
Point query semantics
book_at(tape_path, ts_ns, levels, symbol_id=None, t_from=None):
- Walks events with exchange_ts_ns <= ts_ns and returns the most recent state.
- Returns None when the tape has no book events for the requested symbol up to ts_ns.
- Without a symbol_id filter on a multi-symbol tape, returns the snapshot for whichever symbol carried the most recent book event.
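A minimal model of the point query for one symbol and one side, assuming events arrive sorted by timestamp. The `ladder_at` name and the tuple layout are hypothetical and only meant to make the "at or before" and None rules concrete.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Level = Tuple[float, float]           # (price, qty); assumed layout
Event = Tuple[int, bool, List[Level]]  # (ts_ns, is_snapshot, levels)


def ladder_at(events: Iterable[Event],
              ts_ns: int) -> Optional[Dict[float, float]]:
    """Return the ladder after every event with timestamp <= ts_ns."""
    ladder: Dict[float, float] = {}
    seen = False
    for ev_ts, is_snapshot, levels in events:
        if ev_ts > ts_ns:
            break  # relies on the events being time-sorted
        seen = True
        if is_snapshot:
            ladder.clear()
        for price, qty in levels:
            if qty == 0:
                ladder.pop(price, None)
            else:
                ladder[price] = qty
    return ladder if seen else None  # None: no event at or before ts_ns


S = 1_000_000_000
events = [
    (0 * S, True, [(100.0, 1.0), (99.0, 2.0)]),
    (30 * S, False, [(100.0, 0.5)]),
    (90 * S, False, [(99.0, 0.0)]),
]
at_75 = ladder_at(events, 75 * S)
before_start = ladder_at(events, -1)
print(at_75)         # {100.0: 0.5, 99.0: 2.0}
print(before_start)  # None
```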
Performance
The 38-day BTC tape benchmark from the tracker takes roughly two minutes end-to-end at a 60-second bucket cadence using OrderBookIterator — well under the 5-minute budget. The bound is the Python loop over book events; the underlying DataReader.read_book_updates already runs in C++.
If the tape carries a lot of trades but few book updates, iteration is even cheaper. For a tape-side filter that emits only buckets meeting a microstructure condition (top-K thin levels, depth imbalance over a threshold, queued size per side), wrap the iterator in a generator expression — the per-bucket cost is dominated by the ladder mutation, not the surfacing of the snapshot.
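A filter of that kind might look like the sketch below. It runs over stand-in snapshot objects so it is self-contained; the `Snap` dataclass, its `(price, qty)` level layout, and the 0.5 threshold are assumptions for illustration, and in practice the iterable would be an OrderBookIterator.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Snap:  # hypothetical stand-in for BookSnapshot
    ts_ns: int
    bids: List[Tuple[float, float]]  # (price, qty), best first
    asks: List[Tuple[float, float]]


def imbalance(s: Snap) -> float:
    b = sum(q for _, q in s.bids)
    a = sum(q for _, q in s.asks)
    return (b - a) / (b + a)


snaps = [
    Snap(0, [(100.0, 1.0)], [(101.0, 1.0)]),   # balanced
    Snap(60, [(100.0, 4.0)], [(101.0, 1.0)]),  # bid-heavy, imbalance 0.6
    Snap(120, [], [(101.0, 1.0)]),             # one-sided, skipped
]

# Lazily keep only two-sided snapshots whose imbalance exceeds the threshold.
flagged = (s for s in snaps
           if s.bids and s.asks and abs(imbalance(s)) > 0.5)
flagged_ts = [s.ts_ns for s in flagged]
print(flagged_ts)  # [60]
```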
See also
- Import Binance book archives for filling the tape with book events from the public archive.
- Aggregate tape events in a single pass for the engine-side aggregator framework when the use case fits a streaming bucket reducer.