HDF5 and `long double`: Precision Stored, Precision Misread

When working with scientific simulations, precision matters. Many codes rely on `long double` to squeeze out a few more digits of accuracy. The good news: HDF5 supports the type natively via `H5T_NATIVE_LDOUBLE`, and with H5CPP you can write and read `long double` seamlessly.

The bad news: h5dump, the standard HDF5 inspection tool, stumbles. Instead of your carefully written values, you'll often see tiny denormalized numbers (4.94066e-324) or other junk. This isn't corruption: the file is fine, and h5dump is simply misinterpreting extended-precision types.

Minimal Example

Consider the following snippet:

#include "h5cpp/all"
#include <vector>

int main() {
    // ten long double values: 0.00, 0.01, ..., 0.09
    std::vector<long double> x{0.0L, 0.01L, 0.02L, 0.03L, 0.04L,
                               0.05L, 0.06L, 0.07L, 0.08L, 0.09L};

    h5::fd_t fd = h5::create("test.h5", H5F_ACC_TRUNC);
    // 5x3 chunked dataset of long double
    h5::ds_t ds = h5::create<long double>(fd, "homogenious", h5::current_dims{5,3}, h5::chunk{1,3});
    h5::write(ds, x);
}
Running the code and dumping with h5dump:

h5dump -d /homogenious test.h5

DATA {
(0,0): 4.94066e-324, 4.94066e-324, 4.94066e-324,
...
}

Looks broken, right? But if you read back the dataset with HDF5 or H5CPP:

0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

…the values are correct. The underlying file is perfectly fine.

Why the Mismatch?

h5dump has to map every stored datatype onto a print format, and its floating-point logic is geared toward the IEEE 754 layouts it knows, chiefly 32-bit single and 64-bit double precision. On many systems, long double is 80-bit extended precision (often padded to 12 or 16 bytes) or 128-bit quad precision, which doesn't map cleanly onto the dumper's print logic. Hence the nonsense output.

In other words: the storage layer is solid, but the diagnostic tool lags behind.

Compound Types with long double

HDF5 compound types with H5T_NATIVE_LDOUBLE also work, including arrays of extended-precision fields:

DATASET "stream-of-records" {
   DATATYPE  H5T_COMPOUND {
      H5T_NATIVE_LDOUBLE "temp";
      H5T_NATIVE_LDOUBLE "density";
      H5T_ARRAY { [3] H5T_NATIVE_LDOUBLE } "B";
      H5T_ARRAY { [3] H5T_NATIVE_LDOUBLE } "V";
      ...
   }
   DATASPACE SIMPLE { ( 10 ) / ( H5S_UNLIMITED ) }
   STORAGE_LAYOUT { CHUNKED ( 512 ) }
   FILTERS { COMPRESSION DEFLATE { LEVEL 9 } }
}

Here too, h5dump shows garbage, but reading with HDF5 APIs returns the expected values.

Takeaway

  • Write long double safely with HDF5/H5CPP.
  • Read long double safely with HDF5/H5CPP.
  • Don’t trust h5dump for inspecting long double datasets.