When using the Metaflow UI, the stdout/stderr panes no longer load, and the requests to load them return a 504 gateway timeout.
Example URL requested by the UI for stderr logs:
/api/flows/<flow_name>/runs/59510/steps/start/tasks/539228/logs/err?attempt_id=0&_limit=500&_page=1&_order=-row
I believe the issue is caused by a very expensive join query in async def get_task_by_request(self, request): in ui_backend_service/api/log.py. Looking at the code, this function call and its underlying join query seem unnecessary, given that the UI already passes all the task parameters needed to uniquely identify the task in the Task table directly, including the attempt.
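To illustrate the point about the request already carrying every identifier, here is a minimal sketch that pulls them out of a log URL shaped like the example above. The function name task_identifiers, the regex, and the key names are my own illustration, not code from ui_backend_service, and MyFlow is a stand-in for the redacted flow name.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Path shape taken from the example URL in this issue; the group names are hypothetical.
LOG_PATH = re.compile(
    r"^/api/flows/(?P<flow_id>[^/]+)/runs/(?P<run_number>[^/]+)"
    r"/steps/(?P<step_name>[^/]+)/tasks/(?P<task_id>[^/]+)/logs/(?P<stream>out|err)$"
)


def task_identifiers(url: str) -> dict:
    """Return the identifiers a task log request carries in its path and query string."""
    parts = urlsplit(url)
    match = LOG_PATH.match(parts.path)
    if match is None:
        raise ValueError(f"not a task log URL: {url}")
    ids = match.groupdict()
    query = parse_qs(parts.query)
    # attempt_id is optional in this sketch; None means the caller did not specify one.
    ids["attempt_id"] = int(query["attempt_id"][0]) if "attempt_id" in query else None
    return ids


# "MyFlow" is a placeholder for the flow name redacted in the report.
example = (
    "/api/flows/MyFlow/runs/59510/steps/start/tasks/539228/logs/err"
    "?attempt_id=0&_limit=500&_page=1&_order=-row"
)
print(task_identifiers(example))
```

With flow, run, step, task, and attempt all present, a direct lookup on those values looks possible in principle, though I have not checked whether the existing join is also needed for anything else.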
@saikonen @savingoyal My initial belief was incorrect. This was actually caused by the log CacheAsyncClient and/or CacheAsyncServer getting into a bad state in which it would fetch the logs internally but never return them, so the list of pending streams kept growing. Restarting the ui_backend service resolved the issue.
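For anyone hitting the same symptom, one generic way to keep a stuck fetch from holding a request open until the gateway times out is to bound the wait. This is only a sketch of that idea using plain asyncio; wait_for_stream and the two fake fetchers are hypothetical and do not reflect how CacheAsyncClient is actually implemented.

```python
import asyncio


async def wait_for_stream(fetch, timeout_s: float):
    """Await fetch(), giving up after timeout_s seconds instead of waiting forever."""
    try:
        return await asyncio.wait_for(fetch(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The caller can turn this into an explicit error response.
        return None


async def _stuck_fetch():
    # Hypothetical stand-in for a cache call that never returns.
    await asyncio.Event().wait()


async def _healthy_fetch():
    # Hypothetical stand-in for a cache call that returns log lines.
    return ["line 1", "line 2"]


def demo():
    stuck = asyncio.run(wait_for_stream(_stuck_fetch, 0.05))
    healthy = asyncio.run(wait_for_stream(_healthy_fetch, 0.05))
    return stuck, healthy
```

A timeout like this would not fix the underlying bad state, but it would make the failure visible sooner than a 504 from the gateway.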
The latest release contains some fixes for log handling, especially for large logs, and there have been a few other fixes along the way as well. Is this issue still relevant?