Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for owner() #234

Merged
merged 2 commits into from
Oct 18, 2024
Merged

Add support for owner() #234

merged 2 commits into from
Oct 18, 2024

Conversation

hagenw
Copy link
Member

@hagenw hagenw commented Sep 30, 2024

This is not based on main, but expands the minio branch.

MinIO seems not directly to support the concept of a file owner. At least if you call their stats_object() method and look at its attribute owner_name it is always empty for me.

This pull request adds a workaround, by setting the owner explicitly as a metadata entry, every time a file is uploaded or copied to the server. The owner can then be extracted with stats_object().metadata. I benchmarked, and the proposed changes have no affect on the execution time of put_file().


We should ask ourselves, if we really want to have this feature or not. If we later switch to fsspec for backends (#233), we will again have the problem, that owner is not supported, and we will most likely not be able to add a workaround that is independent from the backend. So maybe we should just drop it?

The challenge is that it provides very meaningful information, e.g. who published a given model or data, which means if we drop it from audbackend, we will most likely need to store those information inside other metadata, e.g. in the header of a dataset.

@hagenw
Copy link
Member Author

hagenw commented Oct 1, 2024

I would vote to support owner() in the MinIO backend (#231) we introduce with the next release of audbackend.

We can then start in parallel to update our packages like audb and audmodel, to no longer rely on audbackend.Interface.Base.owner(), but store the information in their own metadata. When we then target fsspec support (#233) in the next mayor release of audbackend, we can then drop owner() in audbackend.

@@ -398,3 +405,13 @@ def _size(
path = self.path(path)
size = self._client.stat_object(self.repository, path).size
return size


def _metadata():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to maintain _metadata completely separate from the Minio class?
Intuitively one would think of _metadata as a (private) property. On the other hand, the class needs to implement the interface methods from the parent class, and a property might become confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to implement such helper functions independent of the class. My main reason is that in the class we only have the methods that needed to be added, and it does not get cluttered. But maybe it has also some drawbacks?

Copy link
Member

@ChristianGeng ChristianGeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem solved in this MR

This solves the problem that in the current state of our minio deployments the owner_name is always empty. The particular workaround applied fetches the owner attribute from the from the minio client itself.

This is a good working option in the present scenario. The owner field in self._client.stat_object(self.repository, path).owner_name is empty when using minio's internal access management. In case that that a switch to an oidc external identity provider the owner field might contain values that then would become the preferrable option, but for now the implementation allows to track the user in a sufficient fashion.

So in sum I think that conceptually this is ready for approval.

@hagenw hagenw merged commit 371cc8c into minio Oct 18, 2024
@hagenw hagenw deleted the minio-with-owner branch October 18, 2024 17:41
hagenw added a commit that referenced this pull request Oct 30, 2024
* Add audbackend.backend.MinIO

* Add dependency to minio

* Rename MinIO to Minio

* Fix Minio.delete() to first remove objects

* Let Minio.exists() return True for "/"

* Fix Minio.ls() for non-existing paths

* Add tests using a local MinIO server

* Expand tests to 100% code coverage

* Fix remaining issues

* Update docstrings

* Define constant hosts values at single position

* Run tests on play.min.io

* Be more conservative for other copy method

* Add Minio.get_config()

* Add docstring example

* Correct get_config() docstring

* Add secure argument

* Add possibility to provide **kwargs

* Set content-type during upload

* Add **kwargs to docstring

* Add support for owner() (#234)

* Add support for owner()

* Be more conservative regarding owner

* Fix owner test under Windows

* Revert "Fix owner test under Windows"

This reverts commit 643c85d.

* Try to fix Windows owner test

* Replace _close() with close()

* Limit copy size to 4.9 GB

* Ensure errors are tested for MinIO

* Add docstring to test_get_config

* Add docstring to test_maven_file_structure

* Ground truth var for interfaces in _unversioned.py

* Ground truth var for interfaces in _versioned.py

* Ground truth var for interfaces in _maven.py

* Fix typos

* Update get_file test

* Fix typing of _size()

* Remove default None in dict.get()

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

* Improve code quality

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

* Add comment for MinIO credentials

* Improve code quality

* Add exception in _get_file()

* Fix coverage

* DEBUG: coverage

* Try to fix coverage

---------

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants