Custom Floating-Point and Opaque Types in HDF5

Extended-precision floating point (long double) is a common headache in data persistence. HDF5 supports it through H5T_NATIVE_LDOUBLE, but the h5dump inspection tool often misprints the stored values. Fortunately, H5CPP lets you define custom datatypes and fall back to opaque storage when necessary.

Custom Type Definition

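A specialization based on H5T_OPAQUE stores the raw 80-bit (or 128-bit) representation byte for byte. HDF5 applies no conversion to opaque data, so architecture quirks cannot alter the values; the trade-off is that the reader must know the layout: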

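namespace h5::impl::detail {
    // Datatype handle specialization for opaque::ldouble_t;
    // H5Tclose is the matching close call for the handle.
    template <>
    struct hid_t<opaque::ldouble_t, H5Tclose, true, true, hdf5::type>
        : public dt_p<opaque::ldouble_t> {
        using parent = dt_p<opaque::ldouble_t>;
        using parent::hid_t;
        using hidtype = opaque::ldouble_t;

        // Create an opaque datatype exactly sizeof(opaque::ldouble_t) bytes wide.
        hid_t() : parent(H5Tcreate(H5T_OPAQUE, sizeof(opaque::ldouble_t))) {
            hid_t id = static_cast<hid_t>(*this);
        }
    };
}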

This guarantees that your values are written and read back bit for bit, even when h5dump cannot display them.

Example Output

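A dataset written as H5T_NATIVE_LDOUBLE can show up as nonsense in h5dump. Here the elements print as 4.94066e-324, the smallest positive subnormal double, rather than the values that were written: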

DATASET "custom" {
   DATATYPE  H5T_NATIVE_LDOUBLE
   DATA {
   (0): 4.94066e-324, 4.94066e-324, ...
   }
}

The opaque fallback, by contrast, shows the raw byte pattern of each element:

DATASET "opaque" {
   DATATYPE H5T_OPAQUE { OPAQUE_TAG "" }
   DATA {
   (0): 59:16:f5:f3:bb:e2:28:b8:01:40:00:00:00:00:00:00,
   (1): 21:93:2c:c5:cc:f5:5b:90:00:40:00:00:00:00:00:00,
   ...
   }
}
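
The byte patterns above can be turned back into numbers by hand. The sketch below is a hypothetical helper (decode_x87 is not part of H5CPP or HDF5) and assumes each 16-byte record holds an x87 80-bit extended value in little-endian order, as on AMD64. It handles normal numbers only.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of H5CPP or HDF5: decode one 16-byte opaque
// record holding an x87 80-bit extended value in little-endian byte order.
// Zero, subnormals, infinities and NaNs are outside the scope of this sketch.
long double decode_x87(const std::array<unsigned char, 16>& rec) {
    // Bytes 0..7: 64-bit significand with an explicit integer bit.
    std::uint64_t significand = 0;
    for (int i = 7; i >= 0; --i)
        significand = (significand << 8) | rec[i];

    // Bytes 8..9: 15-bit biased exponent, with the sign in the top bit.
    const unsigned sign_exp = rec[8] | (static_cast<unsigned>(rec[9]) << 8);
    const int biased = static_cast<int>(sign_exp & 0x7fffu);
    const bool negative = (sign_exp & 0x8000u) != 0;
    assert(biased != 0 && biased != 0x7fff);  // normal numbers only

    // value = significand * 2^(biased - 16383 - 63); bytes 10..15 are padding.
    const long double magnitude =
        std::ldexp(static_cast<long double>(significand), biased - 16383 - 63);
    return negative ? -magnitude : magnitude;
}
```

Fed the two records shown in the dump, this yields approximately 5.755 and 2.256, which confirms that the file holds sensible values.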

Why Two Views?

  • H5T_NATIVE_LDOUBLE: HDF5 knows the numeric type and can convert it between platforms, but h5dump misprints it.
  • H5T_OPAQUE: exact bytes preserved, useful for debugging or custom parsers.

On AMD64 systems that follow the System V ABI (GCC and Clang on Linux, for example), long double occupies 16 bytes, but only the first 10 carry the 80-bit x87 value. The last 6 are tail padding with unspecified contents. Storing the type as opaque sidesteps any interpretation of that layout, which makes sense when fidelity is critical.
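
Because the padding bytes are unspecified, two equal values may differ in their raw bytes. One way to keep opaque records deterministic is to zero the padding before writing. This is a minimal sketch: normalized_bytes and to_hex are hypothetical helper names, and treating the first 10 bytes as significant assumes the little-endian x87 layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

using ldouble_bytes = std::array<unsigned char, sizeof(long double)>;

// Copy a long double into a zero-filled buffer, keeping only the first
// `significant` bytes (10 for the x87 format on little-endian AMD64).
// The remaining bytes stay zero instead of carrying leftover padding.
ldouble_bytes normalized_bytes(long double value,
                               std::size_t significant = sizeof(long double)) {
    assert(significant <= sizeof(long double));
    ldouble_bytes out{};  // value-initialized: every byte is 0
    std::memcpy(out.data(), &value, significant);
    return out;
}

// Format the bytes the way h5dump prints opaque data: "59:16:f5:...".
std::string to_hex(const ldouble_bytes& bytes) {
    std::string text;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) text += ':';
        text += buf;
    }
    return text;
}
```

On an AMD64 Linux build, to_hex(normalized_bytes(1.0L, 10)) gives 00:00:00:00:00:00:00:80:ff:3f:00:00:00:00:00:00, which has the same shape as the h5dump records above. Other platforms use different sizes and layouts, so the significant byte count has to be chosen per target.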

Beyond Long Double

You’re not limited to long double. With H5CPP you can adapt the same approach to:

  • half precision (float16)
  • nbit packed integers
  • arbitrary bit-level encodings

See the H5CPP examples for twobit, nbit, and half-float.
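
To give a feel for what a bit-level encoding involves, here is a minimal sketch of two-bit packing in plain C++. It is not the H5CPP twobit example: the function names and the bit order (lowest bits first) are assumptions made for illustration. The packed bytes are what you would then hand to a custom or opaque datatype.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack values in the range 0..3 four to a byte, lowest bits first.
std::vector<std::uint8_t> pack_twobit(const std::vector<std::uint8_t>& values) {
    std::vector<std::uint8_t> packed((values.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
        assert(values[i] < 4);  // each value must fit in two bits
        packed[i / 4] |= static_cast<std::uint8_t>(values[i] << (2 * (i % 4)));
    }
    return packed;
}

// Recover `count` two-bit values from the packed buffer.
std::vector<std::uint8_t> unpack_twobit(const std::vector<std::uint8_t>& packed,
                                        std::size_t count) {
    assert(count <= packed.size() * 4);
    std::vector<std::uint8_t> values(count);
    for (std::size_t i = 0; i < count; ++i)
        values[i] = static_cast<std::uint8_t>((packed[i / 4] >> (2 * (i % 4))) & 0x3u);
    return values;
}
```

The element count has to travel with the data, because the final byte may be only partly used.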

Takeaway

  • ✅ Use H5T_NATIVE_LDOUBLE when you want logical portability.
  • ✅ Wrap as OPAQUE when you need raw fidelity and control.
  • ⚠️ Don’t panic when h5dump shows nonsense—the data is safe.

With H5CPP, you get the flexibility to represent any custom precision format—whether for simulation accuracy, bit-packed encodings, or raw experimental data.