Module epiclass.utils.hdf5_to_float32

This module copies HDF5 files to a new directory, converts their datasets to the float32 data type, and repacks the copies to reduce their size.

Functions

def cast_datasets_to_float32(file_path: Path) -> bool

Casts all the datasets in an HDF5 file to the float32 data type.
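
The in-place cast could look roughly like the sketch below, which uses h5py and NumPy. It is an illustration, not the module's actual implementation: the function name `cast_datasets_to_float32_sketch`, the choice to leave non-floating datasets untouched, the `rtol` tolerance, and the meaning of the returned bool (True when every cast value stays close to the original) are all assumptions.

```python
from pathlib import Path

import h5py
import numpy as np


def cast_datasets_to_float32_sketch(file_path: Path, rtol: float = 1e-3) -> bool:
    """Rewrite floating-point datasets as float32 (hypothetical helper).

    Returns True if every cast dataset stayed within `rtol` of the original.
    The meaning of the bool is an assumption, not taken from the module.
    """
    all_close = True
    with h5py.File(file_path, "r+") as f:
        # Collect dataset names first so the file is not modified while walking it.
        names = []

        def collect(name, obj):
            if isinstance(obj, h5py.Dataset):
                names.append(name)

        f.visititems(collect)

        for name in names:
            dataset = f[name]
            # Assumption: only floating-point datasets that are not already float32 are cast.
            if dataset.dtype == np.float32 or not np.issubdtype(dataset.dtype, np.floating):
                continue
            original = dataset[...]
            converted = original.astype(np.float32)
            if not np.allclose(original, converted, rtol=rtol, equal_nan=True):
                all_close = False
            # A dataset's dtype cannot be changed in place, so recreate it under the same name.
            attrs = dict(dataset.attrs)
            del f[name]
            new_dataset = f.create_dataset(name, data=converted)
            for key, value in attrs.items():
                new_dataset.attrs[key] = value
    return all_close
```

Deleting a dataset does not shrink the file on disk, which is why a repacking step is still needed afterwards.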

def copy_hdf5_file(file_path: Path, logdir: Path)

Copies an HDF5 file to a new location, appending "_float32.hdf5" to the filename.
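
A minimal sketch of the copy step, using only the standard library. The name `copy_hdf5_file_sketch`, the returned path, the choice to append the suffix to the file stem, and raising `FileExistsError` when the target already exists are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def copy_hdf5_file_sketch(file_path: Path, logdir: Path) -> Path:
    """Copy `file_path` into `logdir` as "<stem>_float32.hdf5" (hypothetical helper)."""
    # Assumption: the suffix is appended to the stem, so "a.hdf5" becomes "a_float32.hdf5".
    new_path = logdir / f"{file_path.stem}_float32.hdf5"
    if new_path.exists():
        # Assumption: refuse to overwrite an existing copy.
        raise FileExistsError(new_path)
    shutil.copy2(file_path, new_path)  # copy2 also preserves file metadata
    return new_path
```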

def main()

Parses the command-line arguments, then copies the HDF5 files and casts their datasets to float32.

def parse_arguments() -> argparse.Namespace

Parses the command-line arguments.

def process_file(hdf5_file: Path, logdir: Path) -> None

Processes an HDF5 file by copying it to a new location, casting its datasets to float32 data type, and repacking it.

The function first attempts to copy the input file to a new location by appending "_float32.hdf5" to the filename. If the new file already exists, the function logs a warning and returns.

If the new file is successfully created, the function casts all the datasets in the file to the float32 data type and logs any large difference between the original and cast datasets. The function then repacks the file to reduce its size.

If any error occurs during the process, the function logs the error message and traceback, and skips the current file.

Args

hdf5_file : Path
The absolute path to the input HDF5 file.
logdir : Path
The directory where the new file will be created.

Returns

None
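
The control flow described above (skip if the copy already exists, otherwise copy, cast, and repack, logging any error and moving on) can be sketched as follows. To stay self-contained, the three steps are passed in as callables; `process_file_sketch` and its `copy_step`, `cast_step`, and `repack_step` parameters are hypothetical names, and the target filename rule is the same assumption as in the description (suffix appended to the stem).

```python
import logging
import traceback
from pathlib import Path
from typing import Callable


def process_file_sketch(
    hdf5_file: Path,
    logdir: Path,
    copy_step: Callable[[Path, Path], object],
    cast_step: Callable[[Path], object],
    repack_step: Callable[[Path], object],
) -> None:
    """Copy, cast, and repack one file, skipping it on any error (hypothetical helper)."""
    # Assumption: the copy is named "<stem>_float32.hdf5" inside logdir.
    new_file = logdir / f"{hdf5_file.stem}_float32.hdf5"
    if new_file.exists():
        logging.warning("%s already exists, skipping.", new_file)
        return
    try:
        copy_step(hdf5_file, logdir)
        cast_step(new_file)
        repack_step(new_file)
    except Exception:
        # Log the message and traceback, then move on to the next file.
        logging.error("Failed to process %s:\n%s", hdf5_file, traceback.format_exc())
```

Catching the exception inside the function means one bad file does not abort a batch run over many files.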

def repack_hdf5_file(file_path: Path) -> None

Repacks an HDF5 file to reduce its size, using the h5repack command-line tool.
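
One way to call h5repack from Python is sketched below. h5repack writes its output to a second file, so the sketch repacks into a temporary file and then replaces the original. The helper names, the temporary filename, and the absence of extra h5repack options are assumptions; the tool must be installed and on the PATH for `repack_hdf5_file_sketch` to run.

```python
import subprocess
from pathlib import Path
from typing import List


def build_repack_command(file_path: Path, tmp_path: Path) -> List[str]:
    """Build the basic `h5repack <input> <output>` command (hypothetical helper)."""
    return ["h5repack", str(file_path), str(tmp_path)]


def repack_hdf5_file_sketch(file_path: Path) -> None:
    """Repack `file_path` through a temporary file, then replace the original."""
    # Assumption: the temporary file sits next to the original.
    tmp_path = file_path.with_name(file_path.name + ".repack_tmp")
    # check=True raises CalledProcessError if h5repack exits with an error.
    subprocess.run(build_repack_command(file_path, tmp_path), check=True)
    tmp_path.replace(file_path)
```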