Return just `data` when loading `.topostats` #109

ns-rse · 2025-02-03T14:54:20Z

As described in the pull request to add a grains entry point, the structure of a topostats object during processing with TopoStats differs from the structure returned by AFMReader.topostats.load_topostats().

At the end of processing, a topostats object is a nested dictionary of (mostly) NumPy arrays, which is written to HDF5 as is.

However, on loading with AFMReader.topostats.load_topostats(), this dictionary is read back (with the same structure as written to HDF5), two items are extracted from it, and a tuple is returned.

This can be seen from line 42 onwards, where the flattened image and pixel_to_nm_scaling are extracted and then returned as a tuple together with the complete data.

    try:
        with h5py.File(file_path, "r") as f:
            data = unpack_hdf5(open_hdf5_file=f, group_path="/")
            if data["topostats_file_version"] >= 0.2:
                data["img_path"] = Path(data["img_path"])
            file_version = data["topostats_file_version"]
            logger.info(f"[{filename}] TopoStats file version : {file_version}")
            image = data["image"]
            pixel_to_nm_scaling = data["pixel_to_nm_scaling"]

    except OSError as e:
        if "Unable to open file" in str(e):
            logger.error(f"[{filename}] File not found : {file_path}")
        raise e

    return (image, pixel_to_nm_scaling, data)

It would make more sense if a topostats object had the same structure whether it is created during processing or loaded using AFMReader. I believe the current tuple is returned for "convenience", but if only the data were returned, extracting either parameter would not be much different: it just means referring to the required element of the dictionary by name.

Suggest removing this and having instead...

    try:
        with h5py.File(file_path, "r") as f:
            data = unpack_hdf5(open_hdf5_file=f, group_path="/")
            if data["topostats_file_version"] >= 0.2:
                data["img_path"] = Path(data["img_path"])
            file_version = data["topostats_file_version"]
            logger.info(f"[{filename}] TopoStats file version : {file_version}")

    except OSError as e:
        if "Unable to open file" in str(e):
            logger.error(f"[{filename}] File not found : {file_path}")
        raise e

    return data
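For callers, the change amounts to indexing the returned dictionary instead of unpacking a tuple. A minimal sketch, assuming a stand-in dictionary rather than a real `.topostats` file: the keys mirror those used in the snippet above, while the values and the example path are made up purely for illustration.

```python
from pathlib import Path

# Hypothetical stand-in for the dictionary load_topostats() would return
# under this proposal. Values are illustrative, not from a real file.
data = {
    "topostats_file_version": 0.2,
    "img_path": Path("example.topostats"),
    "image": [[0.0, 1.0], [2.0, 3.0]],  # would be a NumPy array in practice
    "pixel_to_nm_scaling": 0.5,
}

# Currently:  image, pixel_to_nm_scaling, data = load_topostats(...)
# Proposed:   data = load_topostats(...), then refer to elements by name.
image = data["image"]
pixel_to_nm_scaling = data["pixel_to_nm_scaling"]
```

Nothing is lost by dropping the tuple, since both values are still present in `data`.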

SylviaWhittle · 2025-02-04T10:18:48Z

I'm tentatively in favour of the proposed change. I am trying to think of any issues but can't come up with any, though it seems like the kind of thing that might bring up unforeseen complications 😅

Adding all the data that is produced during processing would be good.

Do you think I should add the image mask tensor to the .topostats file in my PR or leave it for its own PR?

Closes #109. Rather than returning the HDF5 object loaded from `.topostats` as a dictionary in the third item of a tuple, alongside `image` and `pixel_to_nm_scaling` (both of which are extracted from the loaded data anyway), just the dictionary is returned. This matches the structure of objects created by TopoStats and how they are saved to HDF5 files.

ns-rse · 2025-02-06T14:44:52Z

There will always be something we can't anticipate!

I'd leave the image mask tensor for a separate PR; the grains refactor is a behemoth already!

ns-rse self-assigned this Feb 3, 2025

ns-rse mentioned this issue Feb 3, 2025

Update docstring for load_topostats.py #81

Closed

ns-rse added the v0.1.0 label Feb 4, 2025

ns-rse mentioned this issue Feb 4, 2025

feature: Returns just dictionary when loading .topostats #112

Merged

ns-rse closed this as completed in #112 Feb 4, 2025

ns-rse added this to the v0.1.0 : Expand supported file formats milestone Feb 11, 2025
