
Extendable Datasets in HDF5—Simplified with H5CPP

The OP’s Context

The user wanted to create an HDF5 dataset that can grow over time—think event streams or log records—and do so with a clean C++ interface. Writing their own solution was on the table, but they were seeking something performant and maintainable.

H5CPP to the Rescue (Steven Varga, Mar 2 2023)

“Probably you want to roll your own, which is a good thing—but in case you're looking for a performant solution:”

#include <h5cpp/core>
#include "generated.h"  // your record type mapping
#include <h5cpp/io>
#include <iostream>

int main(){
  auto fd = h5::create("test.h5", H5F_ACC_TRUNC);
  { // Create an extendable dataset from your POD struct
    auto ds = h5::create<sn::record_t>(
      fd, "/path/dataset", h5::max_dims{H5S_UNLIMITED});
    // Assign a vector-of-strings attribute
    ds["attribute"] = {"first","second","...","last"};
    // Convert the dataset to a packet table and append records
    h5::pt_t pt = ds;
    for (int i = 0; i < 3; ++i) {
      h5::append(pt, sn::record_t{
        1.0 * i, 2.0 * i,
        {1,2,3,4,5}, {11,12,13,14,15}
      });
    }
  }
  { // Read back the dataset
    auto ds = h5::open(fd, "/path/dataset");
    auto attribute = h5::aread<std::vector<std::string>>(ds, "attribute");
    std::cout << attribute << std::endl;
    // Dump the data
    for (auto rec : h5::read<std::vector<sn::record_t>>(fd, "/path/dataset"))
      std::cerr << rec.A << " ";
    std::cerr << std::endl;
  }
}

Why This Works So Nicely

| Feature | Benefit |
|---|---|
| H5S_UNLIMITED max dims | Enables a truly extendible dataset |
| sn::record_t POD mapping | Compact, expressive schema definition |
| h5::append(...) API | Simple, zero-boilerplate appends |
| Packet table behind the scenes | Efficient I/O under the hood |
| Vector attribute support | Seamless metadata attachment and retrieval |

TL;DR

Creating appendable, extendable datasets in C++ is no longer boilerplate-heavy. H5CPP gives you:

  • C++ templates for structure mapping
  • Clean append logic with h5::append()
  • Flexible storage with unlimited dataspace
  • Convenient metadata via attributes

Cross-language Messaging with ZeroMQ: C++, Python, and Fortran

🧩 Setup

We’ll use ZeroMQ’s PUSH/PULL pattern to build a minimal, language-agnostic messaging microservice.

Each sender (written in Python, C++, or Fortran) pushes 64-bit values to a single receiver.

📦 Dependencies

Linux (Debian/Ubuntu):

sudo apt-get install libzmq3-dev

Python:

python3 -m pip install pyzmq

Fortran ZMQ bindings (fzmq):

git clone https://github.com/richsnyder/fzmq && cd fzmq
mkdir build && cd build
cmake -DBUILD_DOCS=OFF ../
sudo make install

🛰️ Sender: Python

#!/usr/bin/python3
import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.PUSH)
sock.connect("tcp://localhost:5555")

for x in range(100):
    sock.send(x.to_bytes(8, 'little'))

Highlights:

  • Sends 8-byte integers in little-endian format
  • Easy for testing other receivers

🛰️ Sender: C++

#include <zmq.h>
#include <cstdint>
#include <cstring>

int main() {
    void* ctx = zmq_ctx_new();
    void* sock = zmq_socket(ctx, ZMQ_PUSH);
    zmq_connect(sock, "tcp://localhost:5555");

    for(uint64_t x = 0; x < 100; ++x)
        zmq_send(sock, &x, sizeof(x), 0);

    zmq_close(sock);
    zmq_ctx_term(ctx);
}

Highlights:

  • Sends raw uint64_t values (8 bytes)
  • Matches Python format

🛰️ Sender: Fortran

program send
  use fzmq
  implicit none

  type(zmq_context) :: context
  type(zmq_socket)  :: sock
  integer(kind=8)   :: x

  context = zmq_ctx_new()
  sock = zmq_socket(context, ZMQ_PUSH)
  call zmq_connect(sock, "tcp://localhost:5555")

  do x = 0, 99
    call zmq_send(sock, x, 8, 0)
  end do

  call zmq_close(sock)
  call zmq_ctx_term(context)
end program send

Highlights:

  • Uses fzmq bindings
  • Sends binary 64-bit integers

🎯 Receiver: C++

#include <zmq.h>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
    void* ctx = zmq_ctx_new();
    void* sock = zmq_socket(ctx, ZMQ_PULL);
    zmq_bind(sock, "tcp://*:5555");

    uint64_t x;
    // loop until a short read or an error, so the cleanup below is reachable
    while (zmq_recv(sock, &x, sizeof(x), 0) == sizeof(x))
        printf("recv: %" PRIu64 "\n", x);

    zmq_close(sock);
    zmq_ctx_term(ctx);
}

Highlights:

  • Pulls 64-bit integers from any connected sender
  • No language-specific deserialization required

✅ Summary

ZeroMQ’s PUSH/PULL pattern makes multi-language IPC a breeze.

| Role | Language | Notes |
|---|---|---|
| Sender | Python | Uses pyzmq, clean syntax |
| Sender | C++ | Raw zmq_send of binary integers |
| Sender | Fortran | Uses fzmq bindings |
| Receiver | C++ | Prints values from all senders |

Run the receiver first, then launch any combination of senders. Messages will stream in regardless of language. No serialization frameworks, no boilerplate.
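For example (program names here are illustrative):

./receiver &            # start the PULL endpoint first
python3 sender.py       # then launch any mix of senders, in any order
./sender-cpp
./sender-fortran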

Automated Pretty Printing for STL-like Containers in H5CPP

🛠️ Why not just print?

Debugging C++ is an art and often demands the setup of tools like valgrind, cachegrind, or gdb. These tools are powerful, but sometimes all you need is a quick look at what's inside your container.

Enter H5CPP’s pretty-printing system for STL-like containers, built on the feature-detection idiom.

Inspired by Walter Brown’s WG21 N4436 paper, the mechanism lets you simply write:

std::cout << stl_like_object;

where stl_like_object can be any type that:

  • provides .begin() and .end() (vectors, lists, maps)
  • offers a stack/queue-like interface (.top(), .pop(), .size())
  • is a composite/ragged container such as vector<vector<T>>, tuple<...>, etc.

A minimal sketch of the detection machinery follows.
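Here is a minimal, self-contained sketch of that machinery, assuming nothing beyond the standard library; the real H5CPP implementation detects many more interfaces (and guards against types like std::string that already print):

#include <iostream>
#include <type_traits>
#include <vector>

// detection idiom (in the spirit of WG21 N4436): true when T has .begin()/.end()
template <class T, class = void>
struct is_container : std::false_type {};
template <class T>
struct is_container<T, std::void_t<
        decltype(std::declval<T>().begin()),
        decltype(std::declval<T>().end())>> : std::true_type {};

// operator<< participates in overload resolution only for detected containers;
// nested containers recurse through this same overload
template <class T, class = std::enable_if_t<is_container<T>::value>>
std::ostream& operator<<(std::ostream& os, const T& container) {
    os << '[';
    for (auto it = container.begin(); it != container.end(); ++it)
        os << (it == container.begin() ? "" : ",") << *it;
    return os << ']';
}

int main() {
    std::vector<std::vector<int>> ragged{{1,2},{3,4,5}};
    std::cout << ragged << "\n";   // prints: [[1,2],[3,4,5]]
}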

🔍 What it does

The current implementation supports:

  • Recursive pretty-printing
  • Line width control (via H5CPP_CONSOLE_WIDTH)
  • In-line visualization of arbitrarily nested structures

This will eventually replace/enhance the H5CPP persistence layer with general I/O capabilities.

📦 Example Output

Here's what ./pprint-test prints to stdout:

LISTS/VECTORS/SETS:
---------------------------------------------------------------------
   array<string,7>:[xSs,wc,gu,Ssi,Sx,pzb,OY]
            vector:[CDi,PUs,zpf,Hm,teO,XG,bu,QZs]
             deque:[256,233,23,89,128,268,69,278,130]
              list:[95,284,24,124,49,40,200,108,281,251,57, ...]
      forward_list:[147,76,81,193,44]
               set:[V,f,szy,v]
     unordered_set:[2.59,1.86,2.93,1.78,2.43,2.04,1.69]
          multiset:[3,5,12,21,23,28,30,30]
unordered_multiset:[gZ,rb,Dt,Q,Ark,dW,Ez,wmE,GwF]

And yes, it continues with:

  • Adaptors: stack, queue, priority_queue
  • Associative Containers: map, multimap, unordered_map, ...
  • Ragged Arrays: like vector<vector<string>>
  • Tuples and Pairs: including nested structures

Here’s a teaser for ragged and tuple structures:

vector<vector<string>>:[[pkwZZ,lBqsR,cmKt,PDjaS,Zj],[Nr,jj,xe,uC,bixzV],[uBAU,pXCa,fZEH,FIAIO]]
pair<int,string>:{4:upgdAdbvIB}
tuple<string,int,float,short>:<NHCmzhVVXJ,8,2.01756,7>

🧪 Run it yourself

g++ -I./include -o pprint-test.o -std=c++17 -DFMT_HEADER_ONLY -c pprint-test.cpp
g++ pprint-test.o -lhdf5 -lz -ldl -lm -o pprint-test
./pprint-test

You can set the line width by defining H5CPP_CONSOLE_WIDTH (e.g. -DH5CPP_CONSOLE_WIDTH=10).

📁 Code snippet

#include <h5cpp/H5Uall.hpp>
#include <h5cpp/H5cout.hpp>
#include <utils/types.hpp>   // mock::data generators, part of the h5rnd test utilities
#include <iostream>
#include <string>
#include <vector>

int main() {
    using std::string; using std::vector;
    // generate a random ragged array of strings and pretty-print it
    // (see h5rnd for the meaning of the generator parameters)
    std::cout << mock::data<vector<vector<string>>>::get(2, 5, 3, 7) << "\n";
}

🔮 What’s Next?

If you’d like to bring this up to the level of Julia’s pretty print system, get in touch! The data generators that support arbitrary C++ type generation are part of the larger h5rnd project — a Prüfer sequence-based HDF5 dataset generator.

HDF5 Group Overhead: Measuring the Cost of 100,000 Groups

📦 The Problem

On the HDF5 forum, a user posed a question many developers eventually run into:

"We want to serialize an object tree into HDF5, where every object becomes a group. But group overhead seems large. Why is my 5 MB dataset now 120 MB?"

That’s a fair question. And we set out to quantify it.

⚙️ Experimental Setup

We generated 100,000 unique group names using h5::utils::get_test_data<std::string>(N), then created an HDF5 group for each name at the root level of an HDF5 file using H5CPP.

for (size_t n = 0; n < N; n++) // the temporary h5::gr_t closes each handle immediately (RAII)
    h5::gr_t{H5Gcreate(root, names[n].data(), H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)};

We measured:

  • Total HDF5 file size on disk
  • Total size of the strings used
  • Net overhead from metadata alone
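For reference, a self-contained sketch of the measurement (file name illustrative; h5::utils::get_test_data is the H5CPP test-data generator mentioned above):

#include <h5cpp/all>
#include <filesystem>
#include <iostream>

int main() {
    constexpr size_t N = 100'000;
    auto names = h5::utils::get_test_data<std::string>(N);

    size_t payload = 0;
    { // scope the RAII descriptors so the file is flushed and closed before measuring
        h5::fd_t fd = h5::create("groups.h5", H5F_ACC_TRUNC);
        h5::gr_t root{H5Gopen(fd, "/", H5P_DEFAULT)};
        for (size_t n = 0; n < N; n++) {
            payload += names[n].size();
            h5::gr_t{H5Gcreate(root, names[n].data(), H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)};
        }
    }
    size_t file_size = std::filesystem::file_size("groups.h5");
    std::cout << "file size : " << file_size << " bytes\n"
              << "payload   : " << payload << " bytes\n"
              << "overhead  : " << (file_size - payload) << " bytes\n";
}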

📊 Results

Total file size      :  79,447,104 bytes (~79.4 MB)
Total string payload :   1,749,862 bytes (~1.75 MB)
---------------------------------------------
Net metadata overhead:  77,697,242 bytes (~77.7 MB)

Average per group    : ~776 bytes

Yep. Each group costs roughly 776 bytes, even when it contains no datasets or attributes.

📈 Visual Summary

| Entry Count | File Size | Payload Size | Overhead | Avg/Group |
|---|---|---|---|---|
| 100,000 | ~79.4 MB | ~1.75 MB | ~77.7 MB | ~776 B |

🧠 Why So Expensive?

HDF5 groups are not just simple folders—they are implemented using B-trees and heaps. Each group object has:

  • A header
  • Link messages
  • Heap storage for names
  • Possibly indexed storage for lookup

This structure scales well for access, but incurs overhead for each group created.

🛠 Can Compression Help?

No. Compression applies to datasets, not group metadata. Metadata (including group structures) is stored in an uncompressed format by default.

💡 Recommendations

  • Avoid deep or wide group hierarchies with many small entries
  • If representing an object tree:

  • Consider flat structures with table-like metadata

  • Store object metadata as compound datasets or attributes
  • If you're tracking time-series or per-sample metadata:

  • Store as datasets with indexing, not groups
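As a sketch under the same assumptions as the first article (sn::record_t with its generated.h mapping; file and dataset names illustrative), one appendable packet table can replace a hundred thousand per-object groups:

#include <h5cpp/core>
#include "generated.h"   // sn::record_t type mapping, as in the first article
#include <h5cpp/io>

int main() {
    h5::fd_t fd = h5::create("objects.h5", H5F_ACC_TRUNC);
    // one chunked, unlimited table instead of one group per object
    h5::pt_t pt = h5::create<sn::record_t>(fd, "/objects",
        h5::max_dims{H5S_UNLIMITED}, h5::chunk{4096});
    for (size_t i = 0; i < 100'000; i++)
        h5::append(pt, sn::record_t{});   // per-object metadata goes here
}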

🔚 Final Thoughts

HDF5 is flexible—but that flexibility has a price when misapplied. Using groups to represent every atomic item or configuration object results in significant metadata bloat.

Use them judiciously. Use datasets liberally.

Custom Floating-Point and Opaque Types in HDF5

Extended precision floating-point (long double) is a common headache in data persistence. While HDF5 does support H5T_NATIVE_LDOUBLE, the inspection tools (h5dump) often misreport the stored numbers. Fortunately, H5CPP allows you to define custom datatypes—falling back to opaque storage when necessary.

Custom Type Definition

A specialization with H5T_OPAQUE lets you capture the raw 80-bit (or 128-bit) layout without worrying about architecture quirks:

namespace h5::impl::detail {
    template <>
    struct hid_t<opaque::ldouble_t, H5Tclose, true, true, hdf5::type>
        : public dt_p<opaque::ldouble_t> {
        using parent = dt_p<opaque::ldouble_t>;
        using parent::hid_t;
        using hidtype = opaque::ldouble_t;

        hid_t() : parent(H5Tcreate(H5T_OPAQUE, sizeof(opaque::ldouble_t))) {
            // nothing else to configure: the bytes are stored verbatim
        }
    };
}

This ensures your values are faithfully written and retrievable—even if the dumper chokes on them.
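A minimal usage sketch, assuming opaque::ldouble_t is a trivial wrapper over long double (the wrapper and file names here are illustrative):

#include <h5cpp/all>
#include <vector>

namespace opaque {
    struct ldouble_t { long double value; };  // 16 bytes on AMD64, 10 of them significant
}
// ... the hid_t<opaque::ldouble_t, ...> specialization from above goes here ...

int main() {
    std::vector<opaque::ldouble_t> x(10);
    for (size_t i = 0; i < x.size(); i++)
        x[i].value = 0.01L * i;

    h5::fd_t fd = h5::create("opaque.h5", H5F_ACC_TRUNC);
    h5::write(fd, "opaque", x);   // the bytes land in the file exactly as in memory
}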

Example Output

A dataset written as H5T_NATIVE_LDOUBLE might display as garbage in h5dump:

DATASET "custom" {
   DATATYPE  H5T_NATIVE_LDOUBLE
   DATA {
   (0): 4.94066e-324, 4.94066e-324, ...
   }
}

…but the opaque fallback shows the raw byte patterns:

DATASET "opaque" {
   DATATYPE H5T_OPAQUE { OPAQUE_TAG "" }
   DATA {
   (0): 59:16:f5:f3:bb:e2:28:b8:01:40:00:00:00:00:00:00,
   (1): 21:93:2c:c5:cc:f5:5b:90:00:40:00:00:00:00:00:00,
   ...
   }
}

Why Two Views?

  • H5T_NATIVE_LDOUBLE: portable but misprinted by h5dump.
  • H5T_OPAQUE: exact bytes preserved, great for debugging or custom parsers.

On AMD64 systems, long double is stored in 16 bytes but only the first 10 bytes are significant. The last 6 are tail padding with undefined contents. This is why treating the type as opaque makes sense when fidelity is critical.

Beyond Long Double

You’re not limited to long double. With H5CPP you can adapt the same approach to:

  • half precision (float16)
  • nbit packed integers
  • arbitrary bit-level encodings

See the H5CPP examples for twobit, nbit, and half-float.

Takeaway

  • ✅ Use H5T_NATIVE_LDOUBLE when you want logical portability.
  • ✅ Wrap as OPAQUE when you need raw fidelity and control.
  • ⚠️ Don’t panic when h5dump shows nonsense—the data is safe.

With H5CPP, you get the flexibility to represent any custom precision format—whether for simulation accuracy, bit-packed encodings, or raw experimental data.

HDF5 and long double: Precision Stored, Precision Misread

When working with scientific simulations, precision matters. Many codes rely on long double to squeeze out a few more digits of accuracy. The good news: HDF5 supports H5T_NATIVE_LDOUBLE natively, and with H5CPP you can write and read long double seamlessly.

The bad news: h5dump, the standard HDF5 inspection tool, stumbles. Instead of your carefully written values, you’ll often see tiny denormalized numbers (4.94066e-324) or other junk. This isn’t corruption—it’s just h5dump misinterpreting extended precision types.

Minimal Example

Consider the following snippet:

#include "h5cpp/all"
#include <vector>

int main() {
    std::vector<long double> x{0.0L, 0.01L, 0.02L, 0.03L, 0.04L,
                               0.05L, 0.06L, 0.07L, 0.08L, 0.09L};

    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);
    h5::ds_t ds = h5::create<long double>(fd, "homogenious", h5::current_dims{5,3}, h5::chunk{1,3});
    h5::write(ds, x);
}
Running the code and dumping with h5dump:

h5dump -d /homogenious test.h5

DATA {
(0,0): 4.94066e-324, 4.94066e-324, 4.94066e-324,
...
}

Looks broken, right? But if you read back the dataset with HDF5 or H5CPP:

0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14

…the values are correct. The underlying file is perfectly fine.
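For instance, a readback sketch with H5CPP (same file and dataset as above):

#include <h5cpp/all>
#include <iostream>
#include <vector>

int main() {
    h5::fd_t fd = h5::open("test.h5", H5F_ACC_RDONLY);
    // read the entire dataset back into a flat vector
    auto y = h5::read<std::vector<long double>>(fd, "homogenious");
    for (long double v : y)
        std::cout << v << " ";
    std::cout << "\n";
}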

Why the Mismatch?

h5dump uses its own format string and assumes a particular binary layout for floating-point numbers. On many systems, long double is 80-bit extended precision or 128-bit quad precision, which doesn’t map cleanly to the dumper’s print logic. Hence the nonsense output.

In other words: the storage layer is solid, but the diagnostic tool lags behind.

Compound Types with long double

HDF5 compound types with H5T_NATIVE_LDOUBLE also work, including arrays of extended-precision fields:

DATASET "stream-of-records" {
   DATATYPE  H5T_COMPOUND {
      H5T_NATIVE_LDOUBLE "temp";
      H5T_NATIVE_LDOUBLE "density";
      H5T_ARRAY { [3] H5T_NATIVE_LDOUBLE } "B";
      H5T_ARRAY { [3] H5T_NATIVE_LDOUBLE } "V";
      ...
   }
   DATASPACE SIMPLE { ( 10 ) / ( H5S_UNLIMITED ) }
   STORAGE_LAYOUT { CHUNKED ( 512 ) }
   FILTERS { COMPRESSION DEFLATE { LEVEL 9 } }
}

Here too, h5dump shows garbage, but reading with HDF5 APIs returns the expected values.

Takeaway

  • Write long double safely with HDF5/H5CPP.
  • Read long double safely with HDF5/H5CPP.
  • Don’t trust h5dump for inspecting long double datasets.

Rewriting Attributes in HDF5

While it is not possible to append to or extend attributes in HDF5, attributes usually carry sideband information of relatively small size. (In earlier HDF5 versions attribute size was limited to 64 KiB; Gerd Heber suggests this limitation has since been lifted.)

That said, a good strategy is to break the "append" into a read-the-old-values operation followed by a write-the-new-values operation. The implementation is straightforward and, when used properly, also performant.

#include <vector>
#include <armadillo>
#include <h5cpp/all>

int main(void) {
    h5::fd_t fd = h5::create("h5cpp.h5", H5F_ACC_TRUNC);
    arma::mat data(10,5);

    { // write the dataset, obtain its descriptor, then attach the attribute
        h5::ds_t ds = h5::write(fd, "some_dataset", data);
        h5::awrite(ds, "attribute_name", {1,2,3,4,5,6,7});
    }
}

This gives you the following layout:

h5dump -a /some_dataset/attribute_name  h5cpp.h5
HDF5 "h5cpp.h5" {
ATTRIBUTE "attribute_name" {
   DATATYPE  H5T_STD_I32LE
   DATASPACE  SIMPLE { ( 7 ) / ( 7 ) }
   DATA {
   (0): 1, 2, 3, 4, 5, 6, 7
   }
}
}
To update the attribute you need to remove it first, since H5CPP doesn't yet do this automatically; in fact there is no h5::adelete either! However, by design you can interchange HDF5 C API calls with H5CPP templates, so here is the update with H5Adelete and h5::awrite:

std::vector<int> values{1,2,3,4,5,6,7,20,21,22,23,24,25,26};
H5Adelete(ds,  "attribute_name");
h5::awrite(ds, "attribute_name", values);

h5dump -a /some_dataset/attribute_name  h5cpp.h5
HDF5 "h5cpp.h5" {
ATTRIBUTE "attribute_name" {
   DATATYPE  H5T_STD_I32LE
   DATASPACE  SIMPLE { ( 14 ) / ( 14 ) }
   DATA {
   (0): 1, 2, 3, 4, 5, 6, 7, 20, 21, 22, 23, 24, 25, 26
   }
}
}
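These steps fold naturally into a small helper; a sketch assuming an integer attribute (append_attribute is an illustrative name, not part of H5CPP):

#include <h5cpp/all>
#include <vector>

// emulate "appending" to an attribute: read the old values, extend, delete, rewrite
void append_attribute(h5::ds_t& ds, const char* name, const std::vector<int>& tail) {
    auto values = h5::aread<std::vector<int>>(ds, name);
    values.insert(values.end(), tail.begin(), tail.end());
    H5Adelete(ds, name);           // no h5::adelete yet: drop down to the C API
    h5::awrite(ds, name, values);  // rewrite the attribute with the extended data
}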

Single-Thread Writer: Simplifying Parallel HDF5 I/O

HDF5 has a global lock: no matter how many threads you spawn, only one can execute HDF5 library calls at a time. If you naïvely let multiple threads hammer the library, you get serialization at best and deadlocks at worst.

The solution? One writer thread. All producers hand off data to it via a queue; it alone touches the HDF5 API.

Design Pattern

  1. Producers (sensor readers, network handlers, simulators) run freely.
  2. They push their data into a lock-free or bounded queue.
  3. A single dedicated writer thread pops from the queue and performs all HDF5 calls (H5Dwrite, H5Dset_extent, etc.).

This way, the library never sees concurrent calls, and your application avoids global-lock contention.

Example Flow

// producers: generate records and enqueue them; they never touch HDF5
void producer(queue_t& q, int id) {
    for (int i = 0; i < 100; i++) {
        record_t rec{id, i};
        q.push(rec);
    }
}

// consumer/writer: the only thread that calls into the HDF5 library
void writer(queue_t& q, h5::pt_t& pt) {
    record_t rec;
    while (q.pop(rec))
        h5::append(pt, rec);   // all HDF5 I/O is serialized here
}

Thread Coordination

  • Producers run independently.
  • The writer drains the queue at its own pace.
  • When producers finish, they signal termination, and the writer flushes any remaining data before closing the file. A complete sketch of this coordination follows.
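A self-contained sketch of the whole pattern, with a simple mutex-guarded queue standing in for a lock-free or bounded one (file and dataset names are illustrative, and doubles stand in for your record type):

#include <h5cpp/all>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// minimal multi-producer queue; producers block only on the mutex
struct queue_t {
    std::queue<double> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void push(double v) {
        { std::lock_guard<std::mutex> lock(m); q.push(v); }
        cv.notify_one();
    }
    bool pop(double& v) {   // blocks; returns false once drained after shutdown
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&]{ return !q.empty() || done; });
        if (q.empty()) return false;
        v = q.front(); q.pop();
        return true;
    }
    void shutdown() {
        { std::lock_guard<std::mutex> lock(m); done = true; }
        cv.notify_all();
    }
};

int main() {
    h5::fd_t fd = h5::create("stw.h5", H5F_ACC_TRUNC);
    h5::pt_t pt = h5::create<double>(fd, "events",
        h5::max_dims{H5S_UNLIMITED}, h5::chunk{1024});

    queue_t q;
    std::vector<std::thread> producers;
    for (int id = 0; id < 4; id++)             // producers run freely
        producers.emplace_back([&q, id] {
            for (int i = 0; i < 100; i++)
                q.push(id + 0.001 * i);
        });

    std::thread writer([&] {                   // the only thread calling HDF5
        double v;
        while (q.pop(v))
            h5::append(pt, v);
    });

    for (auto& t : producers) t.join();
    q.shutdown();                              // signal termination; the writer drains the rest
    writer.join();
}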

Benefits

  • Correctness: no race conditions inside HDF5.
  • Performance: eliminates global-lock thrashing.
  • Simplicity: no need for per-thread file handles or MPI gymnastics.

In practice, a well-implemented queue keeps throughput high enough to saturate disk bandwidth. For bursty workloads, batching writes can further smooth performance.

When to Use

  • Multithreaded producers feeding a single HDF5 container.
  • Applications where correctness and predictability outweigh fine-grained parallel writes.
  • Prototypes that may later evolve into MPI-based distributed writers.

Takeaway

HDF5 isn’t thread-parallel—but your architecture can be. Push all I/O through a single writer thread, and let your other threads do what they do best: generate data without blocking.

Fixed-Length vs. Variable-Length Storage in HDF5

HDF5 gives you two ways to store “string-like” or array-like data: fixed-length and variable-length. Each comes with trade-offs, and we benchmarked them head-to-head.

The Setup

We compared writing large arrays of simple POD records, stored either as:

  • Fixed-length fields: every record has the same size.
  • Variable-length fields: each record may grow or shrink.

The benchmark (hdf5-fixed-length-bench.cpp) measures throughput for millions of writes, simulating common HPC/quant workloads.

#include <iostream>
#include <vector>
#include <algorithm>
#include <h5bench>
#include <h5cpp/core>
#include "non-pod-struct.hpp"
#include <h5cpp/io>
#include <fmt/core.h>
#include <fstream>

namespace bh = h5::bench;
bh::arg_x record_size{10'000}; //, 100'000, 1'000'000};
bh::warmup warmup{3};
bh::sample sample{10};
h5::dcpl_t chunk_size = h5::chunk{4096};

std::vector<size_t> get_transfer_size(const std::vector<std::string>& strings){
    // cumulative byte count at each record-size boundary, used to compute throughput
    std::vector<size_t> transfer_size;
    for (size_t i = 0, j = 0, N = 0; i < strings.size(); i++){
        N += strings[i].length();
        if( i == record_size[j] - 1) j++, transfer_size.push_back(N);
    }
    return transfer_size;
}

template<class T> std::vector<T> convert(const std::vector<std::string>& strings){
    return std::vector<T>();
}
template <> std::vector<char[shim::pod_t::max_lenght::value]> convert(const std::vector<std::string>& strings){
    std::vector<char[shim::pod_t::max_lenght::value]> out(strings.size());
    for (size_t i = 0; i < out.size(); i++)
        strncpy(out[i], strings[i].data(), shim::pod_t::max_lenght::value);
    return out;
}

std::vector<const char*> get_data(const std::vector<std::string>& strings){
    std::vector<const char*> data(strings.size());
    // build an array of pointers to the VL strings: one level of indirection
    for (size_t i = 0; i < data.size(); i++)
        data[i] = (char*) strings[i].data();
    return data;
}

std::vector<h5::ds_t> get_datasets(const h5::fd_t& fd, const std::string& name, h5::bench::arg_x& rs){
    std::vector<h5::ds_t> ds;

    for(size_t i=0; i< rs.rank; i++)
        ds.push_back( h5::create<std::string>(fd, fmt::format(name + "-{:010d}", rs[i]), h5::current_dims{rs[i]}, chunk_size));

    return ds;
}

int main(int argc, const char **argv){
    size_t max_size = *std::max_element(record_size.begin(), record_size.end());

    h5::fd_t fd = h5::create("h5cpp.h5", H5F_ACC_TRUNC);
    auto strings = h5::utils::get_test_data<std::string>(max_size, 10, shim::pod_t::max_lenght::value);

    // let's print a few of the strings to give you the picture
    fmt::print("[{:>5}] [{:^30}] [{:>6}]\n", "#", "value", "length");
    for(size_t i=0; i<10; i++) fmt::print("{:>2d}  {:>30}  {:>8d}\n", i, strings[i], strings[i].length());
    fmt::print("\n\n");

    { // POD: FIXED LENGTH STRING + ID
        h5::pt_t ds = h5::create<shim::pod_t>(fd, "FLstring h5::append<pod_t>", h5::max_dims{H5S_UNLIMITED}, chunk_size);
        std::vector<shim::pod_t> data(max_size);
        // we have to copy each string into the POD struct
        for (size_t i = 0; i < data.size(); i++)
            data[i].id = i, strncpy(data[i].name, strings[i].data(), shim::pod_t::max_lenght::value);

        // compute data transfer size, we will be using this to measure throughput:
        std::vector<size_t> transfer_size;
        for (auto i : record_size)
            transfer_size.push_back(i * sizeof(shim::pod_t));

        // actual measurement with burn in phase
        bh::throughput(
            bh::name{"FLstring h5::append<pod_t>"}, record_size, warmup, sample, ds,
            [&](hsize_t idx, hsize_t size) -> double {
                for (hsize_t k = 0; k < size; k++)
                    h5::append(ds, data[k]);
                return transfer_size[idx];
            });
    }

    { // VL STRING, INDEXED BY HDF5 B+TREE, h5::append<std::string>
        h5::pt_t ds = h5::create<std::string>(fd, "VLstring h5::append<std::vector<std::string>> ", h5::max_dims{H5S_UNLIMITED}, chunk_size);
        std::vector<size_t> transfer_size = get_transfer_size(strings);
        // actual measurement with burn in phase
        bh::throughput(
            bh::name{"VLstring h5::append<std::vector<std::string>>"}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                for (hsize_t i = 0; i < size; i++)
                    h5::append(ds, strings[i]);
                return transfer_size[idx];
            });
    }
    { // VL STRING, INDEXED BY HDF5 B+TREE std::vector<std::string>
        auto ds = get_datasets(fd, "VLstring h5::write<std::vector<const char*>> ", record_size);
        std::vector<const char*> data = get_data(strings);
        std::vector<size_t> transfer_size = get_transfer_size(strings);

        // actual measurement with burn in phase
        bh::throughput(
            bh::name{"VLstring h5::write<std::vector<const char*>>"}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                h5::write(ds[idx], data.data(), h5::count{size});
                return transfer_size[idx];
            });
    }

    { // VL STRING, INDEXED BY HDF5 B+TREE std::vector<std::string>
        auto ds = get_datasets(fd, "VLstring std::vector<std::string> ", record_size);
        std::vector<size_t> transfer_size = get_transfer_size(strings);
        // actual measurement with burn in phase
        bh::throughput(
            bh::name{"VLstring std::vector<std::string>"}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                h5::write(ds[idx], strings, h5::count{size});
                return transfer_size[idx];
            });
    }

    { // FL STRING, INDEXED BY HDF5 B+TREE std::vector<std::string>
        using fixed_t = char[shim::pod_t::max_lenght::value]; // type alias

        std::vector<size_t> transfer_size;
        for (auto i : record_size)
            transfer_size.push_back(i * sizeof(fixed_t));
        std::vector<fixed_t> data = convert<fixed_t>(strings);

        // modify VL type to fixed length
        h5::dt_t<fixed_t> dt{H5Tcreate(H5T_STRING, sizeof(fixed_t))};
        H5Tset_cset(dt, H5T_CSET_UTF8); 

        std::vector<h5::ds_t> ds;
        for(auto size: record_size) ds.push_back(
                h5::create<fixed_t>(fd, fmt::format("FLstring CAPI-{:010d}", size), 
                chunk_size, h5::current_dims{size}, dt));

        // actual measurement
        bh::throughput(
            bh::name{"FLstring CAPI"}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                // memory space
                h5::sp_t mem_space{H5Screate_simple(1, &size, nullptr )};
                H5Sselect_all(mem_space);
                // file space
                h5::sp_t file_space{H5Dget_space(ds[idx])};
                H5Sselect_all(file_space);

                H5Dwrite( ds[idx], dt, mem_space, file_space, H5P_DEFAULT, data.data());
                return transfer_size[idx];
            });
    }

    { // Variable Length STRING with CAPI IO calls
        std::vector<size_t> transfer_size = get_transfer_size(strings);
        std::vector<const char*> data = get_data(strings);

        h5::dt_t<char*> dt;
        std::vector<h5::ds_t> ds;

        for(auto size: record_size) ds.push_back(
            h5::create<char*>(fd, fmt::format("VLstring CAPI-{:010d}", size), 
            chunk_size, h5::current_dims{size}));

        // actual measurement
        bh::throughput(
            bh::name{"VLstring CAPI"}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                // memory space
                h5::sp_t mem_space{H5Screate_simple(1, &size, nullptr )};
                H5Sselect_all(mem_space);
                // file space
                h5::sp_t file_space{H5Dget_space(ds[idx])};
                H5Sselect_all(file_space);

                H5Dwrite( ds[idx], dt, mem_space, file_space, H5P_DEFAULT, data.data());
                return transfer_size[idx];
            });
    }

    { // C++ IO stream
        std::vector<size_t> transfer_size = get_transfer_size(strings);
        std::ofstream stream;
        stream.open("somefile.txt", std::ios::out);

        // actual measurement
        bh::throughput(
            bh::name{"C++ IOstream "}, record_size, warmup, sample,
            [&](hsize_t idx, hsize_t size) -> double {
                for (hsize_t k = 0; k < size; k++)
                    stream << strings[k] << std::endl;
                return transfer_size[idx];
            });
        stream.close();
    }
}

Results

  • Fixed-length outperforms variable-length by a wide margin.
  • Predictable size means HDF5 can lay out data contiguously and stream it efficiently.
  • Variable-length introduces extra indirection and heap management, slowing things down.

In our runs, fixed-length writes achieved 70–95% of raw I/O speed, while variable-length lagged substantially behind.

Why It Matters

  • If your schema permits it, prefer fixed-length types.
  • Use variable-length only when data sizes truly vary (e.g., ragged arrays, free-form strings).
  • For high-frequency trading, sensor arrays, or scientific simulations, fixed-length layouts maximize throughput.

POD Check

We also verified which record types qualify as POD (Plain Old Data) via a small utility (is-pod-test.cpp). Only POD-eligible types map safely and efficiently into HDF5 compound layouts. For the shim::pod_t record used above, the checks amount to:

static_assert(std::is_trivial_v<shim::pod_t>);
static_assert(std::is_standard_layout_v<shim::pod_t>);

This ensures compatibility with direct binary writes—no surprises from constructors, vtables, or hidden padding.

Takeaway

  • ✅ Fixed-length fields: fast, predictable, near raw I/O.
  • ⚠️ Variable-length fields: flexible, but slower.
  • 🔧 Use POD records to unlock HDF5’s full performance potential.

If performance is paramount, lock in fixed sizes and let your data pipeline fly.

Bridging HPC Structs and HDF5 COMPOUNDs with H5CPP

🚀 The Problem

You’re running simulations or doing scientific computing. You model data like this:

struct record_t {
    double temp;
    double density;
    double B[3];
    double V[3];
    double dm[20];
    double jkq[9];
};
Now you want to persist these structs into an HDF5 file using the COMPOUND datatype. With the standard C API, that means 20+ lines of verbose, error-prone setup. With H5CPP, just include struct.h and let the tools handle the rest.

🔧 Step-by-Step with H5CPP

1. Define Your POD Struct

namespace sn {
    struct record_t {
        double temp;
        double density;
        double B[3];
        double V[3];
        double dm[20];
        double jkq[9];
    };
}

2. Generate Type Descriptors

Invoke the H5CPP LLVM-based code generator:

h5cpp struct.cpp -- -std=c++17 -I. -Dgenerated.h

It will emit a generated.h file that defines a specialization for:

h5::register_struct<sn::record_t>()

This registers an HDF5 compound type at runtime, automatically.

🧪 Example Usage

Here’s how you write/read a compound dataset with zero HDF5 ceremony:

#include "struct.h"
#include <h5cpp/core>
#include "generated.h"
#include <h5cpp/io>

int main(){
    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);

    // Create a dataset of records with shape (70, 3, 3)
    h5::create<sn::record_t>(fd, "/Module/g_data", h5::max_dims{70, 3, 3});

    // Read it back (freshly created, the records hold fill values;
    // in practice a simulation would have written them first)
    auto records = h5::read<std::vector<sn::record_t>>(fd, "/Module/g_data");
    for (auto rec : records)
        std::cerr << rec.temp << " ";
    std::cerr << std::endl;
}

🔍 What generated.h Looks Like

The generated descriptor maps your struct fields to HDF5 types:

template<> hid_t inline register_struct<sn::record_t>() {
    hid_t ct = H5Tcreate(H5T_COMPOUND, sizeof(sn::record_t));
    H5Tinsert(ct, "temp", HOFFSET(sn::record_t, temp), H5T_NATIVE_DOUBLE);
    H5Tinsert(ct, "density", HOFFSET(sn::record_t, density), H5T_NATIVE_DOUBLE);
    ...
    return ct;
}
Nested arrays (like B[3]) are mapped with H5Tarray_create, and all intermediate hid_t handles are closed.
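For an array member such as B[3], the emitted code looks roughly like this (an illustrative sketch, not verbatim generator output):

hsize_t at_00_dims[] = {3};
hid_t at_00 = H5Tarray_create(H5T_NATIVE_DOUBLE, 1, at_00_dims);
H5Tinsert(ct, "B", HOFFSET(sn::record_t, B), at_00);
// at_00 is closed once all fields are inserted -- see below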

🧵 Thread-safe and Leak-free

Generated code avoids resource leaks by closing array types after insertion, keeping everything safe and clean:

H5Tclose(at_00); H5Tclose(at_01); H5Tclose(at_02);

🧠 Why This Matters

HDF5 is excellent for structured scientific data. But the C API is boilerplate-heavy and distracts from the real logic. H5CPP eliminates this:

  • Describe once, reuse everywhere
  • Autogenerate glue code
  • Zero-copy semantics, modern C++17 syntax
  • Support for nested arrays and multidimensional shapes

✅ Conclusion

If you're working with scientific data in C++, H5CPP gives you the power of HDF5 with the simplicity of a header file. Skip the boilerplate. Focus on science.