Guide
Archiving Tick Data — Store once, query forever (fast, lossless, compressed)
This command extracts the tick stream from one or more IEX TOPS datasets — whether gzip-compressed or not — and stores it in an HDF5 file. Within the container, datasets are organized by trading date under the /irts group. These archived streams can later be reloaded using iex2h5 and transformed into RTS price matrices, OHLC series, or other derived formats — all without data loss or reprocessing.
From Ticks to Prices — resample to bars/OHLCV/RTS with one command.
iex2h5 -o ~/rts.h5 -c rts --time-interval 00:01:00 \
  --date-range 2025-01-01:2025-01-31 --time-range 14:30:00-21:00:00 ~/iex.h5
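To illustrate what sampling a tick stream at a fixed interval (here one minute) means, the following is a minimal sketch in Python. It is not the tool's implementation: the `(timestamp_seconds, price)` tick layout and the function name `resample_ohlc` are assumptions made for this example only.

```python
def resample_ohlc(ticks, interval_s=60):
    """Group (timestamp_seconds, price) ticks into fixed-width OHLC bars.

    Hypothetical helper for illustration; returns a dict keyed by the
    start time of each interval, in chronological order.
    """
    bars = {}
    # Stable sort on the timestamp only, so ticks sharing a timestamp
    # keep their arrival order (this matters for open/close).
    for ts, price in sorted(ticks, key=lambda t: t[0]):
        bucket = int(ts // interval_s) * interval_s  # start of the interval
        bar = bars.get(bucket)
        if bar is None:
            # First tick in this interval sets all four fields.
            bars[bucket] = {"open": price, "high": price,
                            "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far in the interval
    return bars


ticks = [(0, 10.0), (30, 12.0), (59, 11.0), (60, 9.0)]
bars = resample_ohlc(ticks, interval_s=60)
```

Intervals without ticks are simply absent in this sketch; a regular time series would typically carry the previous close forward, which is left out here for brevity.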
Columns correspond to instruments — including inactive ones — and their positions are stable over time, thanks to the incremental instrument database. To drop unused columns, you can either:
- Apply a permutation vector in post-processing, or
- Use a custom instrument map during sampling (functionally equivalent to permuting the output matrix)
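The first option can be sketched as follows. This is an illustration only: it assumes the price matrix has been loaded as a list of rows and that samples of inactive instruments are represented as `None` (the actual fill value in the file may differ). The helper names are hypothetical.

```python
def active_columns(matrix):
    """Return the indices of columns with at least one non-None sample."""
    n_cols = len(matrix[0]) if matrix else 0
    return [j for j in range(n_cols)
            if any(row[j] is not None for row in matrix)]


def select_columns(matrix, index):
    """Apply an index (permutation/selection) vector to every row."""
    return [[row[j] for j in index] for row in matrix]


# Three instruments; the middle one never traded.
prices = [
    [1.0, None, 3.0],
    [1.5, None, None],
]
index = active_columns(prices)
compact = select_columns(prices, index)
```

Keeping `index` alongside the compacted matrix lets you map each remaining column back to its stable position in the instrument database.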
Managing Contracts (Symbols ⇄ IDs) (TBD/not yet implemented)
Quickstart Demo — Run IEX2H5 in 60 Seconds and Explore the Layout
This is the fastest way to get started: just point to your downloaded PCAP files and fire. By default, IEX2H5 extracts all tick-level data from the IEX stream and stores it in an HDF5 container — combining archival tick stream storage with regular interval price matrices in one go. It’s perfect for exploration or sharing small but complete datasets with others.
Performance Benchmarks — tick-stream ingest & query vs other formats.
The IEX2H5 tool provides a powerful benchmarking framework for evaluating the performance of tick-stream ingestion and querying across multiple output formats—HDF5, CSV, JSON, and REDIS. It accepts raw or gzipped PCAP files as input and produces normalized tick streams in each target format, enabling direct comparison of storage footprint, I/O throughput, and query responsiveness.

While originally intended to highlight performance differences between formats, IEX2H5’s flexible CLI makes it a convenient converter as well. With simple invocations like iex2h5 -o ticks.h5 -c irts *.pcap, users can perform one-shot conversions from PCAP to any supported backend, including pushing real-time streams to Redis for downstream consumption.
The output format is deduced from the argument of the -o (or --output) option: a name ending in .h5, .csv, or .json selects the respective format, whereas a Redis target must start with redis://. The full URL format is: redis://[[username:]password@]host[:port][/db]
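The rule above can be sketched in Python. This is not how iex2h5 itself parses the option; the function name `deduce_output` and the returned dictionary layout are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def deduce_output(target):
    """Classify an output target by suffix or redis:// scheme (sketch)."""
    if target.startswith("redis://"):
        url = urlsplit(target)
        user, password = url.username or None, url.password
        # In redis://[[username:]password@]host a lone token before '@'
        # is the password, whereas urlsplit reports it as the username.
        if password is None and user:
            user, password = None, user
        db = url.path.lstrip("/")
        return {
            "format": "redis",
            "username": user,
            "password": password,
            "host": url.hostname,
            "port": url.port,            # None when the port is omitted
            "db": int(db) if db else None,
        }
    for suffix in (".h5", ".csv", ".json"):
        if target.endswith(suffix):
            return {"format": suffix[1:]}
    raise ValueError(f"cannot deduce output format from {target!r}")


h5 = deduce_output("ticks.h5")
redis_target = deduce_output("redis://:secret@localhost:6379/2")
```

Anything that is neither a redis:// URL nor one of the three suffixes raises an error in this sketch; how the tool treats such a target is not covered here.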
Profiling — ballpark or measure performance with built-in tools
To benchmark the performance of iex2h5, a built-in profiling mode is available using gperftools. To enable it, build the project with profiling support by setting appropriate flags: compile with CXXFLAGS="-O3 -g" and configure CMake using -DUSE_GOOGLE_PROFILER=ON along with RelWithDebInfo build type. Then build the project normally with cmake --build build --parallel.
Once compiled, you can run a profiling session with a representative dataset:
This generates a profiling output file iex2h5.prof, which can be converted to Callgrind format using:
pprof-symbolize --callgrind ./build/src/iex2h5 iex2h5.prof > iex2h5.callgrind
kcachegrind iex2h5.callgrind
Profiling support is conditionally enabled in the CMake configuration when libprofiler is found:
When both USE_GOOGLE_PROFILER and GoogleProfiler_FOUND are true, HAVE_GOOGLE_PROFILER is defined and linked, allowing runtime hooks:
Use pprof-symbolize to generate interactive traces for kcachegrind. See gperftools and pprof for more on these tools.
Stopping/Exiting running software
Either hitting Ctrl+C or sending the signals below with killall -SIGINT iex2h5