Skip to content

Bridging HPC Structs and HDF5 COMPOUNDs with H5CPP

🚀 The Problem

You’re running simulations or doing scientific computing. You model data like this:

struct record_t {
    double temp;
    double density;
    double B[3];
    double V[3];
    double dm[20];
    double jkq[9];
};
````

Now you want to persist these structs into an HDF5 file using the COMPOUND datatype.With the standard C API? That means 20+ lines of verbose, error-prone setup. With H5CPP? Just include `struct.h` and let the tools handle the rest.


## 🔧 Step-by-Step with H5CPP
### 1. Define Your POD Struct

```cpp
namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

2. Generate Type Descriptors

Invoke the H5CPP LLVM-based code generator:

h5cpp struct.cpp -- -std=c++17 -I. -Dgenerated.h

It will emit a generated.h file that defines a specialization for:

h5::register_struct<sn::record_t>()

This registers an HDF5 compound type at runtime, automatically.

🧪 Example Usage

Here’s how you write/read a compound dataset with zero HDF5 ceremony:

#include "struct.h"
#include <h5cpp/core>
#include "generated.h"
#include <h5cpp/io>

int main(){
    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);

    // Create dataset with shape (70, 3, 3)
    h5::create<sn::record_t>(fd, "/Module/g_data", h5::max_dims{70, 3, 3});

    // Read it back
    auto records = h5::read<std::vector<sn::record_t>>(fd, "/Module/g_data");
    for (auto rec : records)
        std::cerr << rec.temp << " ";
    std::cerr << std::endl;
}

🔍 What generated.h Looks Like

The generated descriptor maps your struct fields to HDF5 types:

template<> hid_t inline register_struct<sn::record_t>() {
    hid_t ct = H5Tcreate(H5T_COMPOUND, sizeof(sn::record_t));
    H5Tinsert(ct, "temp", HOFFSET(sn::record_t, temp), H5T_NATIVE_DOUBLE);
    H5Tinsert(ct, "density", HOFFSET(sn::record_t, density), H5T_NATIVE_DOUBLE);
    ...
    return ct;
}
Nested arrays (like B[3]) are flattened using H5Tarray_create, and all internal hid_t handles are cleaned up.

🧵 Thread-safe and Leak-free

Generated code avoids resource leaks by closing array types after insertion, keeping everything safe and clean:

H5Tclose(at_00); H5Tclose(at_01); H5Tclose(at_02);

🧠 Why This Matters

HDF5 is excellent for structured scientific data. But the C API is boilerplate-heavy and distracts from the real logic. H5CPP eliminates this:

  • Describe once, reuse everywhere
  • Autogenerate glue code
  • Zero-copy semantics, modern C++17 syntax
  • Support for nested arrays and multidimensional shapes

✅ Conclusion

If you're working with scientific data in C++, H5CPP gives you the power of HDF5 with the simplicity of a header file. Skip the boilerplate. Focus on science.