all_files method goes away in Arvados 3 #41

Open
golharam opened this issue Nov 19, 2024 · 0 comments

file_list = the_col.all_files()

Hi Ryan,

Yes, the "all_files()" method was intentionally removed in 3.0.0, as it depended on several legacy methods and classes that were also deprecated and removed.

Here is a possible replacement implementation you can use:

import arvados.arvfile  # imported explicitly because ArvadosFile is referenced below
import arvados.collection
import collections
import pathlib

def all_files(root_collection):
    """all_files yields tuples of (collection path, file object) for
    each file in the collection."""

    stream_queue = collections.deque([pathlib.PurePosixPath('.')])
    while stream_queue:
        stream_path = stream_queue.popleft()
        subcollection = root_collection.find(str(stream_path))
        for name, item in subcollection.items():
            if isinstance(item, arvados.arvfile.ArvadosFile):
                yield (stream_path / name, item)
            else:
                stream_queue.append(stream_path / name)

# example usage

root_collection = arvados.collection.Collection("723ccb12f21518b2fe936ec68d2d5927+2900")

for fn in all_files(root_collection):
    print(fn)
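
To see the traversal order without access to a cluster, here is a minimal sketch of the same breadth-first queue pattern run against a plain nested dict. The dict is only a stand-in for a collection, and `walk_files` and `fake_collection` are hypothetical names for illustration, not part of the Arvados SDK.

```python
import collections
import pathlib

def walk_files(tree):
    """Yield (path, value) for every non-dict leaf in a nested dict.

    Nested dicts play the role of streams (subdirectories); any other
    value plays the role of a file. This is NOT the Arvados API.
    """
    queue = collections.deque([(pathlib.PurePosixPath('.'), tree)])
    while queue:
        path, node = queue.popleft()
        for name, item in node.items():
            if isinstance(item, dict):
                # A "stream": queue it so it is walked after the current level.
                queue.append((path / name, item))
            else:
                yield (path / name, item)

# Hypothetical stand-in data, for illustration only.
fake_collection = {
    "a.txt": "A",
    "sub": {"b.txt": "B", "deep": {"c.txt": "C"}},
    "z.txt": "Z",
}

paths = [str(p) for p, _ in walk_files(fake_collection)]
print(paths)
```

Because the queue is first-in, first-out, all files at one level are yielded before any file in a subdirectory, and the leading "." is dropped from the joined paths.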

There is also a code snippet in the cookbook for walking over the files in a collection:

https://doc.arvados.org/v3.0/sdk/python/cookbook.html#walk-collection
