-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for owner() #234
Conversation
a102060
to
64c9abb
Compare
I would vote to support We can then start in parallel to update our packages like |
@@ -398,3 +405,13 @@ def _size( | |||
path = self.path(path) | |||
size = self._client.stat_object(self.repository, path).size | |||
return size | |||
|
|||
|
|||
def _metadata(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to maintain _metadata
completely separate from the Minio
class?
Intuitively one would think of _metadata
as a (private) property. On the other hand, the class needs to implement the interface methods from the parent class, and a property might become confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to implement such helper functions independent of the class. My main reason is that in the class we only have the methods that needed to be added, and it does not get cluttered. But maybe it has also some drawbacks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem solved in this MR
This solves the problem that in the current state of our minio deployments the owner_name
is always empty. The particular workaround applied fetches the owner attribute from the from the minio client itself.
This is a good working option in the present scenario. The owner field in self._client.stat_object(self.repository, path).owner_name
is empty when using minio's internal access management. In case that that a switch to an oidc external identity provider the owner field might contain values that then would become the preferrable option, but for now the implementation allows to track the user in a sufficient fashion.
So in sum I think that conceptually this is ready for approval.
* Add audbackend.backend.MinIO * Add dependency to minio * Rename MinIO to Minio * Fix Minio.delete() to first remove objects * Let Minio.exists() return True for "/" * Fix Minio.ls() for non-existing paths * Add tests using a local MinIO server * Expand tests to 100% code coverage * Fix remaining issues * Update docstrings * Define constant hosts values at single position * Run tests on play.min.io * Be more conservative for other copy method * Add Minio.get_config() * Add docstring example * Correct get_config() docstring * Add secure argument * Add possibility to provide **kwargs * Set content-type during upload * Add **kwargs to docstring * Add support for owner() (#234) * Add support for owner() * Be more conservative regarding owner * Fix owner test under Windows * Revert "Fix owner test under Windows" This reverts commit 643c85d. * Try to fix Windows owner test * Replace _close() with close() * Limit copy size to 4.9 GB * Ensure errors are tested for MinIO * Add docstring to test_get_config * Add docstring to test_maven_file_structure * Ground truth var for interfaces in _unversioned.py * Ground truth var for interfaces in _versioned.py * Ground truth var for interfaces in _maven.py * Fix typos * Update get_file test * Fix typing of _size() * Remove default None in dict.get() Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com> * Improve code quality Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com> * Add comment for MinIO credentials * Improve code quality * Add exception in _get_file() * Fix coverage * DEBUG: coverage * Try to fix coverage --------- Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
This is not based on
main
, but expands theminio
branch.MinIO seems not directly to support the concept of a file owner. At least if you call their
stats_object()
method and look at its attributeowner_name
it is always empty for me.This pull request adds a workaround, by setting the owner explicitly as a metadata entry, every time a file is uploaded or copied to the server. The owner can then be extracted with
stats_object().metadata
. I benchmarked, and the proposed changes have no affect on the execution time ofput_file()
.We should ask ourselves, if we really want to have this feature or not. If we later switch to
fsspec
for backends (#233), we will again have the problem, thatowner
is not supported, and we will most likely not be able to add a workaround that is independent from the backend. So maybe we should just drop it?The challenge is that it provides very meaningful information, e.g. who published a given model or data, which means if we drop it from
audbackend
, we will most likely need to store those information inside other metadata, e.g. in the header of a dataset.