# Bridging HPC Structs and HDF5 COMPOUNDs with H5CPP
## 🚀 The Problem
You’re running simulations or doing scientific computing. You model data like this:
```cpp
struct record_t {
    double temp;
    double density;
    double B[3];
    double V[3];
    double dm[20];
    double jkq[9];
};
```
Now you want to persist these structs into an HDF5 file using the COMPOUND datatype. With the standard C API? That means 20+ lines of verbose, error-prone setup. With H5CPP? Just include `struct.h` and let the tools handle the rest.
## 🔧 Step-by-Step with H5CPP
### 1. Define Your POD Struct
```cpp
namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}
```
### 2. Generate Type Descriptors
Invoke the H5CPP LLVM-based code generator on your source. It emits a `generated.h` file that defines a specialization of `register_struct` for `sn::record_t`. This registers an HDF5 compound type at runtime, automatically.
## 🧪 Example Usage
Here’s how you create and read back a compound dataset with zero HDF5 ceremony:
```cpp
#include <iostream>
#include <vector>

#include "struct.h"
#include <h5cpp/core>
#include "generated.h"   // emitted by the code generator
#include <h5cpp/io>

int main() {
    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);

    // Create dataset with shape (70, 3, 3)
    h5::create<sn::record_t>(fd, "/Module/g_data", h5::max_dims{70, 3, 3});

    // Read it back
    auto records = h5::read<std::vector<sn::record_t>>(fd, "/Module/g_data");
    for (const auto& rec : records)   // iterate by reference to avoid copying each record
        std::cerr << rec.temp << " ";
    std::cerr << std::endl;
}
```
## 🔍 What `generated.h` Looks Like
The generated descriptor maps your struct fields to HDF5 types:
```cpp
template<> hid_t inline register_struct<sn::record_t>() {
    hid_t ct = H5Tcreate(H5T_COMPOUND, sizeof(sn::record_t));
    H5Tinsert(ct, "temp", HOFFSET(sn::record_t, temp), H5T_NATIVE_DOUBLE);
    H5Tinsert(ct, "density", HOFFSET(sn::record_t, density), H5T_NATIVE_DOUBLE);
    ...
    return ct;
}
```
Array members (e.g. `B[3]`) are mapped to HDF5 array types using `H5Tarray_create`, and all internal `hid_t` handles are cleaned up.
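
The layout that such a descriptor records can be checked without HDF5 installed. The sketch below assumes only the `sn::record_t` definition from step 1; `field_info` and `record_layout` are hypothetical helpers introduced for illustration and are not part of H5CPP. Standard `offsetof` plays the role that `HOFFSET` plays in the generated code.

```cpp
#include <cassert>
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

// A byte-for-byte compound mapping only makes sense for a plain, C-compatible layout.
static_assert(std::is_standard_layout<sn::record_t>::value,
              "record_t must be standard layout");
static_assert(std::is_trivially_copyable<sn::record_t>::value,
              "record_t must be trivially copyable");

// Hypothetical helper: one row of the layout table a compound descriptor encodes.
struct field_info {
    const char* name;
    std::size_t offset;  // byte offset of the member inside record_t
    std::size_t count;   // number of doubles (1 for scalars, N for double[N])
};

// Hypothetical helper: returns the layout table for sn::record_t and its length in n.
inline const field_info* record_layout(std::size_t& n) {
    static const field_info table[] = {
        {"temp",    offsetof(sn::record_t, temp),    1},
        {"density", offsetof(sn::record_t, density), 1},
        {"B",       offsetof(sn::record_t, B),       3},
        {"V",       offsetof(sn::record_t, V),       3},
        {"dm",      offsetof(sn::record_t, dm),      20},
        {"jkq",     offsetof(sn::record_t, jkq),     9},
    };
    n = sizeof(table) / sizeof(table[0]);
    return table;
}
```

Because every member is a `double`, the fields pack back to back: the struct holds 37 doubles, and the last field ends exactly at `sizeof(sn::record_t)`. If the struct later gains a member of a different type, the same table makes any padding visible.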
## 🧵 Thread-safe and Leak-free
Generated code avoids resource leaks by closing array types after insertion, keeping everything safe and clean.
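
In hand-written code, one way to get the same close-after-insert guarantee is a small RAII guard. This is a different technique from the explicit close calls the generated code is described as using, shown here only as a sketch. `scoped_handle` is a hypothetical helper, not part of H5CPP or HDF5, and `fake_open` / `fake_close` are stubs standing in for `H5Tarray_create` / `H5Tclose` so the example runs without HDF5 installed.

```cpp
#include <cassert>

// Hypothetical RAII guard: owns a handle and calls `close` exactly once at scope exit.
template <class Handle, class Close>
class scoped_handle {
public:
    scoped_handle(Handle h, Close close) : h_(h), close_(close) {}
    scoped_handle(const scoped_handle&) = delete;             // a handle has one owner
    scoped_handle& operator=(const scoped_handle&) = delete;
    ~scoped_handle() { close_(h_); }
    Handle get() const { return h_; }
private:
    Handle h_;
    Close  close_;
};

// Stubs standing in for the HDF5 calls; they only count how many handles are open.
inline int& open_handles() { static int n = 0; return n; }
inline long fake_open()      { ++open_handles(); return 42; }
inline int  fake_close(long) { --open_handles(); return 0; }

// Mirrors the shape of "create array type, insert it, close it".
inline int insert_array_member() {
    scoped_handle<long, int (*)(long)> arr(fake_open(), &fake_close);
    // Real code would insert arr.get() into the compound type at this point.
    return open_handles();  // the handle is still open while the guard is alive
}   // guard destructor runs here and closes the handle, also on early return or exception
```

With HDF5 available, the same guard could wrap a real `hid_t` with `H5Tclose` as the closer. The count of open handles returns to zero as soon as `insert_array_member` returns.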
## 🧠 Why This Matters
HDF5 is excellent for structured scientific data. But the C API is boilerplate-heavy and distracts from the real logic. H5CPP eliminates this:
- Describe once, reuse everywhere
- Autogenerate glue code
- Zero-copy semantics, modern C++17 syntax
- Support for nested arrays and multidimensional shapes
## ✅ Conclusion
If you're working with scientific data in C++, H5CPP gives you the power of HDF5 with the simplicity of a header file. Skip the boilerplate. Focus on science.