feature: massive logs streaming #424

saikonen · 2024-06-05T15:35:56Z

Alternative approach to streaming massive logs. Does not require any changes to Metaflow client side.

Intended to supersede #423

TODO:

support legacy logs
reliable cache/GC of the raw log downloads on disk so they can be reused across cache action calls.
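
For the cache/GC item above, one possible shape is an age-based sweep of the download directory. This is only a sketch under stated assumptions: `gc_log_downloads`, `cache_dir`, and `max_age_seconds` are hypothetical names for illustration, not the code in this PR, and a real implementation may need locking or reference tracking so that files still being served are not removed.

```python
import os
import time
from pathlib import Path


def gc_log_downloads(cache_dir, max_age_seconds, now=None):
    """Delete files under cache_dir whose mtime is older than max_age_seconds.

    Hypothetical helper for illustration. Returns the removed paths, sorted.
    """
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so deletions do not affect iteration.
    for path in list(Path(cache_dir).rglob("*")):
        try:
            if not path.is_file():
                continue
            if now - path.stat().st_mtime > max_age_seconds:
                path.unlink()
                removed.append(str(path))
        except FileNotFoundError:
            # Another worker may have removed the file already.
            continue
    return sorted(removed)
```

Passing `now` explicitly makes the sweep deterministic in tests; in normal use it can be left as `None`.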

romain-intel

A few comments; I'll follow up offline.

romain-intel · 2024-06-11T06:44:36Z

services/ui_backend_service/data/cache/get_log_file_action.py

-                content = log_provider.get_log_content(task, logtype)
-                results[log_key] = json.dumps({"log_hash": current_hash, "content": content})
+                total_lines = count_total_lines(local_paths)
+                results[log_key] = json.dumps({"log_hash": current_hash, "content_paths": local_paths, "line_count": total_lines})


Mostly for my education, but why json.dumps? We then seem to be doing a lot of loads. Is this serialized somewhere outside the process?
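
To make the round trip in question concrete, here is a minimal sketch of the new payload shape from the diff above: paths and a line count instead of the log content itself. The body of `count_total_lines` is an assumption (only its name appears in the diff), and `build_result` is a hypothetical wrapper added for illustration.

```python
import json


def count_total_lines(paths):
    """Count lines across files by streaming them instead of loading them whole.

    Assumed implementation; a final line without a trailing newline is counted.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            for _ in f:
                total += 1
    return total


def build_result(log_hash, local_paths):
    """Serialize the cache result as a JSON string (hypothetical wrapper)."""
    return json.dumps({
        "log_hash": log_hash,
        "content_paths": local_paths,
        "line_count": count_total_lines(local_paths),
    })


def read_result(serialized):
    """Every consumer has to parse the string back before using it."""
    return json.loads(serialized)
```

Each reader pays one `json.loads` per access, which is the cost the question is pointing at; the payload stays small because it carries file paths rather than log content.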

romain-intel · 2024-06-11T06:59:56Z

services/ui_backend_service/data/cache/get_log_file_action.py

+            *task.path_components,
+            attempt
+        )
+        paths = dict(


Can the non-underscore method get_log_location not be used here? (I am always a little wary of using internal methods.)

romain-intel · 2024-06-11T07:12:01Z

services/ui_backend_service/data/cache/get_log_file_action.py

+                    log_paths[name] = None
+                else:
+                    shutil.move(path, log_paths[name])
+    return [val for val in log_paths.values() if val is not None]


I wonder if this whole method could be simplified by using the task datastore's load_logs (through the filecache if you want, although I am not 100% sure about that) with the addition of something like "write_to_disk_only". In other words, you seem to be doing a lot of what load_logs already does, with the only difference being that you don't open the logs but just keep them on disk. I feel we could make that small change there (without modifying any of the storage layers, just keeping it at the task datastore level), and that should simplify this code a lot. Basically, your line 280 (or something akin to it) would go around line 984 of task_datastore.py (protected by a flag or something).

saikonen · 2024-06-12

Yep, as a follow-up I will look into hoisting all the logic required by fetch_logs into the Metaflow client side, for something akin to dump_logs(stream, destination_path).
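
The "keep it on disk, never open it in memory" idea discussed in this thread could look roughly like the sketch below. This is not Metaflow API: `dump_log_to_disk` is a hypothetical helper, and it assumes the log is available as a binary file-like object, which may not match how the datastore actually exposes it.

```python
import io
import os
import shutil
import tempfile


def dump_log_to_disk(source, destination_path, chunk_size=1024 * 1024):
    """Copy a binary file-like object to destination_path in fixed-size chunks.

    Hypothetical helper. Writes to a temporary file in the target directory
    first, then renames it, so readers never see a partially written log.
    """
    dest_dir = os.path.dirname(os.path.abspath(destination_path))
    os.makedirs(dest_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # Streams chunk by chunk; memory use is bounded by chunk_size.
            shutil.copyfileobj(source, tmp, chunk_size)
        os.replace(tmp_path, destination_path)
    except BaseException:
        # Clean up the partial file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return destination_path
```

For example, `dump_log_to_disk(io.BytesIO(b"line1\nline2\n"), "/tmp/logs/stdout.log")` writes the bytes without ever building a decoded string of the whole log.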

saikonen added 6 commits May 30, 2024 18:07

wip: testing

0b88447

add log streaming download to api

162dd08

local dev improvements

5269933

wip: first working draft of log streaming

f6b6b71

implement naive log reverse

8d3c91a

skip fetch if log files exist on disk.

f429e4d

saikonen requested review from savingoyal and romain-intel June 5, 2024 15:35

saikonen added 6 commits June 6, 2024 00:59

add support for legacy logs

c608c58

move log dump temp folder under cache_data

43710f7

fix page count

98fa93a

fix raw log stitching

c1dff94

lint

28bc2a6

fix tests

1800648

romain-intel reviewed Jun 11, 2024

saikonen added 4 commits June 12, 2024 18:11

add GC to log blob downloads

a352e4d

styles

3fca55e

fix handling logs for running tasks

7da2935

cleanup

06950fd

saikonen merged commit 87ef097 into master Jun 12, 2024
5 of 6 checks passed
