[Python] File staging to user worker support #34208
Assigning reviewers. If you would like to opt out of this review, comment R: @liferoad for label python. Available commands:
The PR bot will only process comments in the main thread (not review comments).
I think this might have similar issues for FlinkRunner: #32743
Here I want to plumb the staging of arbitrary files through the stager and document how they should be read under portability. I guess other runners will need some future work to make those files available under the environment.get('SEMI_PERSISTENT_DIRECTORY', None) + '/staged' path.
I mean this might not work for other runners. Can you test this with Dataflow to confirm this works?
Sure. This works for the Python SDK on Runner V2. I'm trying to test it with the expansion service (no env variable for sure, but I want to check whether the staged directory with those files is there).
Thanks for this change! I am seeing some errors in precommit tests:
|
The logs are a bit painful to dig through, but seeing this:
|
Thanks, will take a look at it!
Thanks!
* file staging * add tests * help message * format * typo * yapf * docs * support for multiple files * changes.md * fix test
Add a files-to-stage flag to the Python SDK to support uploading arbitrary files to the user worker.
This avoids modifying the container boot code.
Files are staged to a location that depends on the runner. For Python it is:
semi_persistent_directory = environment.get('SEMI_PERSISTENT_DIRECTORY', None) + '/staged'
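As a rough illustration of how user code on the worker might locate a staged file, here is a minimal sketch. It assumes `environment` is a mapping such as `os.environ` and that the runner actually sets `SEMI_PERSISTENT_DIRECTORY`; the helper names `staged_dir` and `read_staged_file` are hypothetical and not part of the Beam API. The expression in the description would raise a `TypeError` when the variable is unset (`None + '/staged'`), so the sketch guards against that case.

```python
import os
from typing import Mapping, Optional


def staged_dir(environment: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the '<SEMI_PERSISTENT_DIRECTORY>/staged' path, or None if unset.

    Hypothetical helper: only the variable name and the 'staged' suffix
    come from the PR description.
    """
    base = environment.get('SEMI_PERSISTENT_DIRECTORY')
    if base is None:
        # Some runners may not set the variable (see the discussion above).
        return None
    return os.path.join(base, 'staged')


def read_staged_file(
    name: str, environment: Mapping[str, str] = os.environ) -> bytes:
    """Read a staged file by name, failing clearly if staging is unavailable."""
    directory = staged_dir(environment)
    if directory is None:
        raise FileNotFoundError(
            'SEMI_PERSISTENT_DIRECTORY is not set; staged files are unavailable')
    with open(os.path.join(directory, name), 'rb') as f:
        return f.read()
```

Inside a `DoFn`, a call such as `read_staged_file('my_config.json')` (file name hypothetical) would then only succeed on runners that populate the staged directory.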