@rilshok rilshok commented Aug 23, 2025

When using S3FileSystem.ls or S3FileSystem.listdir, zero-sized objects that serve as directory markers in S3 were incorrectly listed as files.

Amazon S3 employs a flat object storage model, but its console interface allows users to create "folders" by adding zero-size objects with keys ending in a delimiter (e.g., myfolder/). These objects act as placeholders to visually represent directories but are not actual files.

This PR implements a check to skip zero-size objects that match the current directory prefix during listing operations.
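As a rough illustration of the kind of check described above (not the actual diff in this PR), the filter could look something like the sketch below. It assumes entries shaped like the `Contents` items of an S3 ListObjectsV2 response (`Key` and `Size`); the helper name `drop_directory_markers` is hypothetical and is not part of s3fs.

```python
def drop_directory_markers(contents, prefix):
    """Drop the zero-size object whose key is the listed prefix itself.

    Hypothetical helper for illustration only; it is not s3fs code.
    """
    # The console-style marker for "myfolder" is the key "myfolder/".
    marker_key = prefix.rstrip("/") + "/"
    return [
        obj
        for obj in contents
        if not (obj["Size"] == 0 and obj["Key"] == marker_key)
    ]


listing = [
    {"Key": "myfolder/", "Size": 0},           # directory marker
    {"Key": "myfolder/data.csv", "Size": 120},
    {"Key": "myfolder/empty.txt", "Size": 0},  # genuine empty file, kept
]
files = drop_directory_markers(listing, "myfolder")
```

Only the key equal to the prefix is dropped, so real empty files inside the directory still show up in the listing.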

@martindurant
Member

Thanks for submitting. I will comment more on Monday, after the tests have run. I will say that there is already extensive conversation around this topic in this repo, and that such a change will need extensive testing. I also don't see any reference to a terminating "/", but I only took a 2-minute look.

@martindurant
Member

I am surprised to see only one failure! Of course, the tests do NOT tend to make placeholder files, since we know they are problematic and unnecessary.

I would expect that bulk rm may fail in the case of placeholders, since the files deleted will be those returned by find(), which would no longer include the placeholders. Only normal files would be deleted and not the "directories", leaving the tree structure erroneously in place. This should be explicitly tested.
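To make the concern concrete, here is a toy in-memory model of that scenario. It is only a sketch: the dict `store` stands in for a bucket, and `find` and `bulk_rm` are simplified stand-ins written for this example, not the s3fs implementations. It assumes `find` skips every zero-size key ending in "/".

```python
def find(store, prefix):
    """List keys under prefix, skipping zero-size keys that end in '/'."""
    root = prefix.rstrip("/") + "/"
    return sorted(
        key
        for key, size in store.items()
        if key.startswith(root) and not (size == 0 and key.endswith("/"))
    )


def bulk_rm(store, prefix):
    """Delete exactly the keys that find() reports, as a bulk rm might."""
    for key in find(store, prefix):
        del store[key]


# Keys map to object sizes; "tree/" and "tree/sub/" are placeholders.
store = {"tree/": 0, "tree/a.txt": 5, "tree/sub/": 0, "tree/sub/b.txt": 7}
bulk_rm(store, "tree")
leftover = sorted(store)
```

In this model the two regular files are removed, while `tree/` and `tree/sub/` remain in `store`. A real test would need to create placeholder objects in the test bucket and assert that nothing is left after a recursive rm.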

@rilshok
Author

rilshok commented Aug 27, 2025

Yes, you are right that this scenario needs to be considered.
