Parallelization Patterns for HDF5 I/O in C++
The Question (Stefano Salvadè, Feb 19, 2018)
Goal: write analysis results in parallel to multiple HDF5 files—one per stream/process. The application is in C#, calling into a C/C++ API with HDF5 and MPI.
Current thought: typically one launches with `mpiexec -n x Program.exe`, but spawning processes at runtime via `MPI_Comm_spawn()` seems clunky. Is there a more elegant way to run parallel I/O functions within the same program? Also, is one process needed per write action (whether to separate files or a single shared file)?
Steven Varga’s Take—Less PHDF5, More Pragmatism
Parallel HDF5 (PHDF5) shines in setups with parallel file systems and true distributed environments—think HPC clusters with coordinated I/O capabilities.
But in simpler contexts—e.g., a single machine or cloud instance—PHDF5 often imposes unused complexity and file-system limitations (filters unsupported, extra boilerplate, etc.).
Instead, Stefano could:
1. Use separate HDF5 files per process, even in RAM or temp storage
2. Aggregate later, e.g. via:
- copying into one file, or
- using HDF5 external links (`H5Lcreate_external`) to compose them into a single logical container
The aggregation step could run as a separate batch job after the main MPI job finishes.
If you do have real parallel I/O infrastructure, then yes—PHDF5 gives benefits. But often, simple is better.
— Steve
Summary Table
Scenario | Recommended Approach | Reasoning |
---|---|---|
N streams → N separate files (no shared file) | Serial HDF5 per process | Simplicity, no PHDF5 overhead, independent files |
Need to combine results later | Aggregate post-run (external file driver or merge scripts) | Keeps initial write simple; flexible downstream processing |
True parallel I/O on a parallel file system | Use PHDF5 with MPI | Efficient coordinated I/O, but more complexity and system requirements |
When to Use What?
- Use PHDF5 when:
  - You're in a high-performance cluster environment
  - The file system supports parallel write throughput
  - You benefit from collective operations and synchronized metadata handling
- Stick with serial HDF5 per process when:
  - You're on a single system or cloud VM
  - You want to avoid complexity in your write path
  - You can afford a merge or collector step after the run
Wrap-Up Thoughts
Stefano’s “elegant parallel output within a single program” goal doesn’t necessarily require MPI-spawned processes or PHDF5. Often the simplest is best: spawn OS-level processes writing to their own files, then merge or link them later.
This keeps performance high, complexity low, and coordination overhead manageable.