Zero-Cost C++ Structs to HDF5 Compound Types with H5CPP

🧬 The Setup: From HPC POD Structs to HDF5 COMPOUNDs

You're handed an HDF5 file with compound data like this:

DATASET "g_data" {
  DATATYPE  H5T_COMPOUND {
    H5T_IEEE_F64LE "temp";
    H5T_IEEE_F64LE "density";
    H5T_ARRAY { [3] H5T_IEEE_F64LE } "B";
    H5T_ARRAY { [3] H5T_IEEE_F64LE } "V";
    H5T_ARRAY { [20] H5T_IEEE_F64LE } "dm";
    H5T_ARRAY { [9] H5T_IEEE_F64LE } "jkq";
  }
  DATASPACE SIMPLE { ( 70, 3, 3 ) / ( 70, 3, 3 ) }
}

With H5CPP, you don't need to touch the C API. Just describe this with a plain-old C++ struct:

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

That’s it.
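If you want the compiler to confirm that the struct really is a plain, padding-free block of doubles before handing it to any tooling, a few compile-time checks do the job. This is a minimal sketch using only the standard library; the constant `record_doubles` is a name introduced here for illustration, and the size check assumes a platform where a struct of doubles needs no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

// 1 + 1 + 3 + 3 + 20 + 9 doubles per record (illustrative constant)
constexpr std::size_t record_doubles = 37;

// The struct can be copied byte for byte, as raw I/O buffers require
static_assert(std::is_trivially_copyable<sn::record_t>::value,
              "record_t must be trivially copyable");

// Member offsets are well defined, so offsetof-style macros are safe to use
static_assert(std::is_standard_layout<sn::record_t>::value,
              "record_t must have standard layout");

// Every member is a double, so no padding is expected on common platforms
static_assert(sizeof(sn::record_t) == record_doubles * sizeof(double),
              "record_t contains unexpected padding");
```

If any of these fire, the in-memory layout no longer lines up with the six-member compound type shown in the dump, and it is worth finding out why before reading or writing data.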

🧰 Write Code as If You Had a Magic HDF5 Interface

No manual H5Tinsert, no dataspace juggling:

#include <iostream>
#include <vector>
#include "struct.h"
#include <h5cpp/core>
#include "generated.h"  // <-- auto-generated with h5cpp compiler
#include <h5cpp/io>

int main(){
    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);
    h5::create<sn::record_t>(fd, "/Module/g_data", h5::max_dims{70, 3, 3});

    auto dataset = h5::read<std::vector<sn::record_t>>(fd, "/Module/g_data");
    for (const auto& rec : dataset)
        std::cerr << rec.temp << " ";
    std::cerr << std::endl;
}
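The dataset is rank 3 with extents ( 70, 3, 3 ), while the program above iterates over a one-dimensional vector. HDF5 lays elements out in C (row-major) order, so if the records arrive as one flat run of 70 x 3 x 3 elements, which is an assumption about the read rather than something shown above, a small index helper maps (i, j, k) back to a position. The names `flat_index` and `total_records` are hypothetical helpers, not part of H5CPP.

```cpp
#include <cassert>
#include <cstddef>

// Dataset extents taken from the dump: ( 70, 3, 3 )
constexpr std::size_t N0 = 70;
constexpr std::size_t N1 = 3;
constexpr std::size_t N2 = 3;

// Total number of records in the dataset
constexpr std::size_t total_records() { return N0 * N1 * N2; }

// Row-major (C order) position of element (i, j, k):
// the last index varies fastest, the first slowest
inline std::size_t flat_index(std::size_t i, std::size_t j, std::size_t k) {
    assert(i < N0 && j < N1 && k < N2);  // guard against out-of-range indices
    return (i * N1 + j) * N2 + k;
}
```

With a flat vector named `dataset`, the record at grid position (i, j, k) would then be `dataset[flat_index(i, j, k)]`.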

Compile and run:

h5cpp struct.cpp -- -std=c++17 -I/usr/include -Dgenerated.h
g++ -std=c++17 -I/usr/include -c struct.cpp -o struct.o
g++ struct.o -lhdf5 -lz -ldl -lm -o struct
./struct

🧠 Under the Hood: Codegen Output

H5CPP automatically generates a minimal type descriptor that registers your struct with the HDF5 type system:

template<> hid_t inline register_struct<sn::record_t>(){
    // array declarations omitted for brevity
    hid_t ct = H5Tcreate(H5T_COMPOUND, sizeof(sn::record_t));
    H5Tinsert(ct, "temp", HOFFSET(sn::record_t, temp), H5T_NATIVE_DOUBLE);
    ...
    return ct;
}

The handles the generated code opens along the way are released with H5Tclose(...), so nothing leaks.
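To see what each H5Tinsert call is actually told, it helps to tabulate the member names and byte offsets yourself. HOFFSET is HDF5's wrapper around the standard offsetof macro, so the sketch below reproduces the same numbers without linking against HDF5. The `field_t` type and `describe_record` function are hypothetical names introduced for illustration; this is not the generator's output.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

// Hypothetical description of one compound member (not an H5CPP or HDF5 type)
struct field_t {
    std::string name;    // member name as it appears in the HDF5 compound
    std::size_t offset;  // byte offset, the value HOFFSET(sn::record_t, member) yields
    std::size_t count;   // number of doubles: 1 for scalars, N for double[N]
};

// Lists the members in declaration order with their offsets
inline std::vector<field_t> describe_record() {
    return {
        {"temp",    offsetof(sn::record_t, temp),    1},
        {"density", offsetof(sn::record_t, density), 1},
        {"B",       offsetof(sn::record_t, B),       3},
        {"V",       offsetof(sn::record_t, V),       3},
        {"dm",      offsetof(sn::record_t, dm),      20},
        {"jkq",     offsetof(sn::record_t, jkq),     9},
    };
}
```

Each entry pairs a name with an offset, which is the same information a compound type needs for every member; the array members additionally need an array datatype, which is the part the listing above leaves out.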

📚 Can I Extract One Field From a Compound?

"Can I create a virtual dataset from just the 'density' field in g_data?"

Short answer: yes, but not easily.

HDF5 virtual datasets (VDS) work best with entire datasets, not fields of compound types. However, a workaround is to:

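  • Read through a memory compound type that contains only the density member. HDF5 matches compound members by name, so the read fills in just that field; hyperslab selections pick elements of the dataspace, not members of a compound.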
  • Or, repack the data using H5CPP into a derived dataset containing only the fields you want.

Future H5CPP support might wrap this more cleanly, but currently, you'll want to do the extraction in-memory.
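The in-memory extraction itself is a one-liner once the records are loaded. Here is a minimal sketch, assuming the records already sit in a `std::vector<sn::record_t>`; the function name `extract_density` is hypothetical and not part of H5CPP.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

// Copies the density member of every record into a contiguous vector of doubles
inline std::vector<double> extract_density(const std::vector<sn::record_t>& records) {
    std::vector<double> density;
    density.reserve(records.size());  // one output value per input record
    std::transform(records.begin(), records.end(), std::back_inserter(density),
                   [](const sn::record_t& r) { return r.density; });
    return density;
}
```

The result keeps the element order of the input, so it can be written out as a dataset of plain doubles with the same ( 70, 3, 3 ) shape as the original.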


✅ Summary

  • Describe your data in C++ once.
  • H5CPP generates the boilerplate and builds your type descriptors.
  • You can now read, write, and manipulate rich compound datasets without touching the C API.
  • Clean syntax, high performance, zero leaks.

Ready to ditch boilerplate and focus on real work?