The Original Ask (David Schneider, Dec 10 2016)

David from SLAC/LCLS posed this neat challenge:

“Is it possible to implement an in‑memory circular buffer using HDF5? We'd like both offline (on‑disk append) and online (shared‑memory overwrite) access via the same HDF5 schema and API, possibly using SWMR for shared-memory consumption.” Basically, a single schema that works for both archival and real-time consumption. Elegant.

1. Werner's Insight: The Virtual File Driver

Werner dropped the elegant solution:

“There is a virtual file driver that operates on a memory image of an HDF5 file. It should be no problem to have this one also operate on shared memory.”

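That refers to HDF5’s core VFD, which keeps the entire file image in memory instead of on disk. In principle that image can live in a buffer you supply, including shared memory obtained via mmap or shm_open, although with stock HDF5 this goes through the file image API (for example H5Pset_file_image and its callbacks) or a custom driver rather than working out of the box. The same dataset API (H5Dcreate, H5Dwrite, etc.) applies either way, so you can reuse your schema unchanged.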
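To make the shared-memory half of that idea concrete, here is a minimal sketch in Python using only the standard library. It does not use HDF5 at all: it only shows a named shared-memory region that a writer overwrites in circular fashion and that a reader attaches to by name. The layout (an 8-byte write counter followed by float64 slots) and the names `write_sample`, `read_latest`, and `demo` are assumptions made for illustration, and there is no synchronization, so a real deployment would need something like the SWMR semantics David mentions.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<Q")  # hypothetical layout: 8-byte write counter
SLOT = struct.Struct("<d")    # one float64 sample per slot
CAPACITY = 4                  # number of slots (illustrative value)


def write_sample(buf, value):
    """Overwrite the oldest slot, then publish the new write count."""
    (count,) = HEADER.unpack_from(buf, 0)
    offset = HEADER.size + (count % CAPACITY) * SLOT.size
    SLOT.pack_into(buf, offset, value)
    HEADER.pack_into(buf, 0, count + 1)


def read_latest(buf):
    """Return the samples still in the buffer, oldest first."""
    (count,) = HEADER.unpack_from(buf, 0)
    first = max(0, count - CAPACITY)
    return [
        SLOT.unpack_from(buf, HEADER.size + (i % CAPACITY) * SLOT.size)[0]
        for i in range(first, count)
    ]


def demo():
    size = HEADER.size + CAPACITY * SLOT.size
    writer = shared_memory.SharedMemory(create=True, size=size)
    try:
        HEADER.pack_into(writer.buf, 0, 0)  # start with an empty buffer
        # A reader (normally a separate process) attaches by name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            for value in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0):
                write_sample(writer.buf, value)
            return read_latest(reader.buf)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()
```

Calling `demo()` writes six samples into four slots, so the reader sees only the last four. With the core VFD approach, the region would hold an HDF5 file image instead of this hand-rolled layout, and the slot arithmetic would be replaced by dataset writes.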

2. Steve’s Real‑World Twist (HFT-inspired)

Steve Varga chimed in with a production-grade twist:

“Boost’s circular/ring buffer handles one-writer-many-readers; tail flushing can be channeled to the writer or fault‑tolerant hosts. Combine with ZeroMQ + Protocol Buffers or Thrift.”
“For experiments, where failure isn't critical, you can just access HDF5 locally on cluster nodes using MPI + Grid Engine + serial HDF5.”

So if you need industrial-strength durability, go with a ring buffer plus messaging middleware. For HPC experiments where speed and simplicity triumph, stick with HDF5 + MPI.
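The tail-flushing idea in Steve's first suggestion can be sketched as follows. This is a plain Python illustration, not Boost's `circular_buffer` and not anything from the thread: the class name `FlushingRing` and its `flush` callback are hypothetical, and the messaging layer (ZeroMQ, Protocol Buffers, Thrift) is left out. The point is only that the oldest item is handed to an archival sink just before it would be overwritten, which is how one buffer can serve live readers and an on-disk append at the same time.

```python
from collections import deque


class FlushingRing:
    """Fixed-capacity buffer that hands evicted items to a sink."""

    def __init__(self, capacity, flush):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._items = deque()
        self._capacity = capacity
        self._flush = flush  # hypothetical callback, e.g. an on-disk append

    def push(self, item):
        # When full, flush the tail (oldest item) before accepting a new one.
        if len(self._items) == self._capacity:
            self._flush(self._items.popleft())
        self._items.append(item)

    def snapshot(self):
        """Return the live contents, oldest first."""
        return list(self._items)


# Usage: a Python list stands in for the archival sink.
archive = []
ring = FlushingRing(capacity=3, flush=archive.append)
for sample in range(5):
    ring.push(sample)
```

After the loop, the ring holds the three newest samples and `archive` holds the two that were flushed. In a setup like the one Steve describes, the callback would presumably publish to a message queue or append to an HDF5 dataset instead.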

Quick API Sketch (Julia‑Flavored)

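```julia
using HDF5

# NOTE: h5open_sharedmem is hypothetical; HDF5.jl does not provide it.
# It stands in for "open an HDF5 file image that lives in shared memory".
fid = h5open_sharedmem(shm_address, mode="r+")

# Use the HDF5 API as if working on a real file.
# Extendible datasets must be chunked; -1 marks an unlimited dimension.
dset = create_dataset(fid, "buffer", Float64, ((N,), (-1,)), chunk=(N,))
write(dset, new_chunk)
close(fid)
```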

Summary Table

| Scenario | Approach | Notes |
| --- | --- | --- |
| On-disk appendable buffer | HDF5 datasets (append mode) | Standard functionality |
| In-memory circular buffer | HDF5 via core VFD over a memory region | Shared schema/API in RAM |
| High‑throughput, production-grade | Boost ring buffer + messaging (ZeroMQ, ProtoBuf) | More robust, fault-tolerant |
| Experimental/distributed HPC | HDF5 per node + MPI/Scheduler (serial HDF5) | Simple, performance-focused |