Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return just data when loading .topostats #109

Closed
ns-rse opened this issue Feb 3, 2025 · 2 comments · Fixed by #112
Closed

Return just data when loading .topostats #109

ns-rse opened this issue Feb 3, 2025 · 2 comments · Fixed by #112
Assignees
Labels

Comments

@ns-rse
Copy link
Collaborator

ns-rse commented Feb 3, 2025

Described in the pull to add a grains entry point the structure of a topostats object during processing with topostats is different from that which is returned by AFMReader.topostats.load_topostats().

At the end of processing a topostats object is a nested dictionary of (mostly) Numpy arrays which is written to HDF5 as is.

However, on loading with AFMReader.topostats.load_topostats() this dictionary is loaded and two items extracted from the data dictionary (which is the structure as written to HDF5) and a tuple returned.

This can be seen from line 42 onwards where the flattened image is extracted along with pixel_to_nm_scaling and returned along with the complete data as a tuple.

    try:
        with h5py.File(file_path, "r") as f:
            data = unpack_hdf5(open_hdf5_file=f, group_path="/")
            if data["topostats_file_version"] >= 0.2:
                data["img_path"] = Path(data["img_path"])
            file_version = data["topostats_file_version"]
            logger.info(f"[{filename}] TopoStats file version : {file_version}")
            image = data["image"]
            pixel_to_nm_scaling = data["pixel_to_nm_scaling"]

    except OSError as e:
        if "Unable to open file" in str(e):
            logger.error(f"[{filename}] File not found : {file_path}")
        raise e

    return (image, pixel_to_nm_scaling, data)

It would make more sense if a topostats object were the same structure, whether it is created during processing or loaded using AFMReader. I believe the current situation is for "convenience" but extraction of either parameter if only the data were returned would not be wholly different, i.e. referring to the required element of a dictionary by name.

Suggest removing this and having instead...

    try:
        with h5py.File(file_path, "r") as f:
            data = unpack_hdf5(open_hdf5_file=f, group_path="/")
            if data["topostats_file_version"] >= 0.2:
                data["img_path"] = Path(data["img_path"])
            file_version = data["topostats_file_version"]
            logger.info(f"[{filename}] TopoStats file version : {file_version}")

    except OSError as e:
        if "Unable to open file" in str(e):
            logger.error(f"[{filename}] File not found : {file_path}")
        raise e

    return data
@SylviaWhittle
Copy link
Collaborator

I'm tentatively in favour of the proposed change there, I am trying to think of any issues but can't come up with any, though it seems like the kind of thing that might bring up unforeseen complications 😅

Adding all the data that is produced during processing would be good.

Do you think I should add the image mask tensor to the .topostats file in my PR or leave it for its own PR?

ns-rse added a commit that referenced this issue Feb 4, 2025
Closes #109

Rather than return the HDF5 object loaded from `.topostats` as a dictionary as the third item in a tuple along with the
`image` and `pixel_to_nm_scaling` (both of which are extracted from the loaded data anyway) just the dictionary is
returned. This matches the structure of objects created by TopoStats and how they are saved to HDF5 files.
@ns-rse
Copy link
Collaborator Author

ns-rse commented Feb 6, 2025

There will always be something we can't anticipate!

I'd leave the image mask tensor as a separate PR, grains refactor is a behemoth already!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants