Single-Thread Writer: Simplifying Parallel HDF5 I/O
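A thread-safe build of HDF5 guards the whole library with a single global lock: no matter how many threads you spawn, only one can execute HDF5 library calls at a time. Without the thread-safe build option there is no lock at all, and concurrent calls are simply unsafe. If you naïvely let multiple threads hammer the library, you get serialization at best and deadlocks or corrupted state at worst.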
The solution? One writer thread. All producers hand off data to it via a queue; it alone touches the HDF5 API.
Design Pattern
- Producers (sensor readers, network handlers, simulators) run freely.
- They push their data into a thread-safe queue: lock-free for low latency, or bounded and blocking so that slow I/O applies backpressure.
- A single dedicated writer thread pops from the queue and performs all HDF5 calls (`H5Dwrite`, `H5Dset_extent`, etc.).
This way, the library never sees concurrent calls, and your application avoids global-lock contention.
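The hand-off queue is the only shared state in this design, so it is worth seeing what a minimal one can look like. The sketch below is an assumption, not part of any HDF5 API: `bounded_queue` and its `push`, `pop`, and `close` members are hypothetical names, and a mutex with two condition variables stands in for whatever lock-free structure a production system might prefer.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded multi-producer queue; all names are illustrative.
template <typename T>
class bounded_queue {
public:
    explicit bounded_queue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full (backpressure on producers).
    // Returns false if the queue has been closed.
    bool push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return closed_ || items_.size() < capacity_; });
        if (closed_) return false;
        items_.push_back(std::move(value));
        not_empty_.notify_one();
        return true;
    }

    // Blocks while the queue is empty. Returns false only once the queue is
    // closed AND fully drained, so no record is dropped at shutdown.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();
        return true;
    }

    // Marks the end of input and wakes every waiting thread.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
        not_full_.notify_all();
    }

private:
    std::size_t capacity_;
    std::deque<T> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    bool closed_ = false;
};
```

The capacity bound is a deliberate choice: if the disk falls behind, producers block in `push` instead of letting memory grow without limit.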
Example Flow
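```cpp
// producers: generate records and hand them off; they never call HDF5
void producer(queue_t& q, int id) {
    for (int i = 0; i < 100; i++) {
        record_t rec{id, i};
        q.push(rec);
    }
}

// consumer/writer: the only thread that touches the HDF5 API
// (q.pop is assumed to block, returning false once the queue is closed and drained)
void writer(queue_t& q, h5::ds_t& ds) {
    record_t rec;
    while (q.pop(rec)) {
        h5::append(ds, rec); // all HDF5 I/O is serialized here
    }
}
```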
Thread Coordination
- Producers run independently.
- The writer drains the queue at its own pace.
- When producers finish, they signal termination, and the writer flushes any remaining data before closing the file.
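The shutdown sequence in the list above can be sketched end to end. This is one possible arrangement under stated assumptions: `channel`, `record_t`, and `run_pipeline` are hypothetical names, the queue is left unbounded for brevity, and a `std::vector` stands in for the HDF5 dataset so the example runs without the library. The point is the ordering: join the producers, close the queue, then join the writer.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct record_t { int producer; int seq; };

// Minimal unbounded channel (hypothetical), just enough to show shutdown.
struct channel {
    std::mutex m;
    std::condition_variable cv;
    std::queue<record_t> items;
    bool closed = false;

    void push(record_t r) {
        {
            std::lock_guard<std::mutex> lock(m);
            items.push(r);
        }
        cv.notify_one();
    }

    // Termination signal: no more records will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m);
            closed = true;
        }
        cv.notify_all();
    }

    // Returns false only when the channel is closed and empty.
    bool pop(record_t& out) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return closed || !items.empty(); });
        if (items.empty()) return false;
        out = items.front();
        items.pop();
        return true;
    }
};

// Runs the pipeline and returns how many records the writer consumed.
std::size_t run_pipeline(int n_producers, int per_producer) {
    channel ch;
    std::vector<record_t> sink;  // stand-in for the HDF5 dataset

    std::thread writer([&] {
        record_t rec;
        while (ch.pop(rec)) {
            sink.push_back(rec);  // real code: the HDF5 append goes here
        }
        // real code: flush and close the file here, still on the writer thread
    });

    std::vector<std::thread> producers;
    for (int id = 0; id < n_producers; ++id) {
        producers.emplace_back([&ch, id, per_producer] {
            for (int i = 0; i < per_producer; ++i) ch.push({id, i});
        });
    }

    for (auto& t : producers) t.join();  // 1. every producer has finished
    ch.close();                          // 2. signal termination
    writer.join();                       // 3. writer drains the rest and exits
    return sink.size();
}
```

Because `pop` keeps returning records until the channel is both closed and empty, calling `run_pipeline(4, 100)` hands all 400 records to the writer before it exits.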
Benefits
- ✅ Correctness: no race conditions inside HDF5.
- ✅ Performance: eliminates global-lock thrashing.
- ✅ Simplicity: no need for per-thread file handles or MPI gymnastics.
In practice, the queue is rarely the bottleneck: a single writer that issues reasonably large writes can usually keep up with disk bandwidth. For bursty workloads, batching several records into one write further smooths throughput.
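Batching can be as simple as letting the writer take several pending records at once. The helper below is a sketch with hypothetical names (`take_batch`, `record_t`); it assumes the pending records live in a `std::deque` guarded by a mutex, and it moves up to `max_batch` of them out under a single lock acquisition. The writer could then extend the dataset once and write the whole batch in one call, rather than once per record.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

struct record_t { int producer; int seq; };

// Removes up to max_batch records from the front of `pending`, in order,
// while holding the lock only once. Returns an empty vector if nothing is pending.
std::vector<record_t> take_batch(std::deque<record_t>& pending,
                                 std::mutex& m,
                                 std::size_t max_batch) {
    std::lock_guard<std::mutex> lock(m);
    const std::size_t n = std::min(max_batch, pending.size());
    const auto last = pending.begin() + static_cast<std::ptrdiff_t>(n);
    std::vector<record_t> batch(pending.begin(), last);
    pending.erase(pending.begin(), last);
    return batch;
}
```

The right `max_batch` depends on record size and the dataset's chunk layout; a value that fills roughly one chunk per write is a reasonable starting point to measure from.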
When to Use
- Multithreaded producers feeding a single HDF5 container.
- Applications where correctness and predictability outweigh fine-grained parallel writes.
- Prototypes that may later evolve into MPI-based distributed writers.
Takeaway
HDF5 isn’t thread-parallel—but your architecture can be. Push all I/O through a single writer thread, and let your other threads do what they do best: generate data without blocking.