
Review zipping tool #2215

Closed
tcompa opened this issue Jan 27, 2025 · 9 comments

tcompa commented Jan 27, 2025

Context:

  1. In a specific instance, we recently observed a clear correlation between:
    • The zipping of a somewhat large job folder (22k files, 180M unzipped)
    • A worker timeout (which means the worker did not update its corresponding tmp file)
  2. In the past, we observed a (non-reproducible) error upon zipping a large job folder - ref Failure in zipping large folder (9k images, 900M) #1955.
  3. Our zipping tool is currently based on Python zipfile.
  4. When working on the SSH runner (which also involves creating compressed folders to be sent back and forth), we moved from Python tarfile to a subprocess.run wrap of native tools (that is, tar).
    • We observed that the Python tarfile library was considerably slower than the native tools, and attributed at least part of this difference to the large number of additional os.stat calls. Note that this difference became more evident in our environment (a CEPH filesystem with a cold cache).
    • For the SSH runner, we then moved from tarfile to tar, see e.g. "1596: review use of tar/tarfile in compress_folder.py module" #1641.

We don't have a controlled way of reproducing the issue in point 1, but all the items above suggest that we should also move the current zipfile-based zipping operations to subprocess.run wraps of gzip. In the worst case, we are just improving the performance of one operation. In the best case, we are fixing the issue at point 1.
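
For reference, here is a minimal sketch of what such a subprocess.run-based wrapper could look like (illustrative only, with hypothetical names; the SSH runner already uses a similar tar-based approach, ref #1641):

import shlex
import subprocess


def compress_folder(folder: str, archive_path: str) -> None:
    # Delegate the directory walk and the compression to the native tar tool,
    # rather than Python tarfile/zipfile, avoiding the extra per-file os.stat calls.
    cmd = f"tar czf {archive_path} -C {folder} ."
    subprocess.run(shlex.split(cmd), check=True, capture_output=True)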


tcompa commented Jan 27, 2025

Minor update in terms of finding the right explanation:
The zipping operation is likely blocking. If this is the case, that worker cannot perform other tasks (e.g. sending a heartbeat) for the time it takes to complete the zip operation.
(ref #1507)

Mitigation 1 is clearly "make it faster".
Mitigation 2 (if needed) is to also make it async.
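
For concreteness, a minimal sketch of mitigation 2 (names are illustrative; zip_job_folder stands in for the current blocking implementation): the blocking call is offloaded to a worker thread via asyncio.to_thread, so the event loop stays free to send heartbeats and serve requests.

import asyncio


def zip_job_folder(folder: str) -> None:
    # Placeholder for the current (blocking) zipfile-based implementation.
    ...


async def zip_job_folder_async(folder: str) -> None:
    # asyncio.to_thread (Python >= 3.9) runs the blocking call in a separate
    # thread, without blocking the event loop.
    await asyncio.to_thread(zip_job_folder, folder)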


jluethi commented Jan 27, 2025

The zipping operation is likely blocking

Oh, that's relevant for sure. Yes, making this a non-blocking operation in an async approach would be very valuable, to ensure we don't run into worker-timeout issues when things scale unexpectedly (larger data provenance for some reason, a job with much larger logs than usual, more images being processed, etc.)


tcompa commented Jan 27, 2025

In the current concurrency model (see #2218), we need to be careful about making the zipping function async and non-blocking.

Here are three examples of how to run a bash command:

  1. subprocess.run running from threads --> non-blocking
  2. async function that calls subprocess.run running in event loop --> blocking
  3. asyncio.create_subprocess_exec running in event loop --> non-blocking

We are currently close to situation 2 (we don't run subprocess.run, but other blocking functions), which means the whole zipping operation is blocking.
We will move to 3 right away.
Later, when fixing #2218, we will even simplify things by moving to 1.

from concurrent.futures import ThreadPoolExecutor
import subprocess
import shlex
import time
import asyncio


def function(sleeptime):
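    # Plain subprocess.run: it blocks the calling thread, but when submitted to a
    # ThreadPoolExecutor it does not block the main thread / event loop.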
    subprocess.run(
        shlex.split(f"sleep {sleeptime}"),
        capture_output=True,
        check=True,
    )


async def function_async_subprocess(sleeptime):
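    # subprocess.run inside an async def still blocks the event loop while it runs,
    # so concurrent tasks cannot overlap.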
    subprocess.run(
        shlex.split(f"sleep {sleeptime}"),
        capture_output=True,
        check=True,
    )


async def function_async_create_subprocess_exec(sleeptime):
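    # asyncio.create_subprocess_exec hands control back to the event loop while the
    # child process runs, so concurrent tasks do overlap.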
    proc = await asyncio.create_subprocess_exec(*shlex.split(f"sleep {sleeptime}"))
    stdout, stderr = await proc.communicate()
    print()


async def run_it_all_subprocess(sleeptime):
    task1 = asyncio.create_task(function_async_subprocess(sleeptime))
    task2 = asyncio.create_task(function_async_subprocess(sleeptime))
    await task1
    await task2


async def run_it_all_create_subprocess_exec(sleeptime):
    task1 = asyncio.create_task(function_async_create_subprocess_exec(sleeptime))
    task2 = asyncio.create_task(function_async_create_subprocess_exec(sleeptime))
    await task1
    await task2


if __name__ == "__main__":
    print("BLOCK 1")
    t0 = time.perf_counter()
    sleeptime = 1
    with ThreadPoolExecutor(max_workers=10) as executor:
        fut1 = executor.submit(function, sleeptime)
        fut2 = executor.submit(function, sleeptime)

    t1 = time.perf_counter()
    print(f"{t1-t0}")
    assert abs(t1 - t0 - sleeptime) < 0.1
    print()

    print("BLOCK 2")
    t0 = time.perf_counter()
    asyncio.run(run_it_all_subprocess(sleeptime))
    t1 = time.perf_counter()
    print(f"{t1-t0}")
    print()

    print("BLOCK 3")
    t0 = time.perf_counter()
    asyncio.run(run_it_all_create_subprocess_exec(sleeptime))
    t1 = time.perf_counter()
    print(f"{t1-t0}")
    assert abs(t1 - t0 - sleeptime) < 0.1
    print()


tcompa commented Jan 27, 2025

Question cc @jluethi

Right now we can download the zipped job folders in two ways:

  1. After the job is complete, we can download the zipped version of the final job folder (the one that is also kept on disk).
  2. During job execution, we can only download a partial zipped version of the job folder (e.g. it may include only some images, or only some tasks).

Use case 1 is obviously critical.
Is use case 2 also relevant?

  • If not, we could simplify the implementation a bit by removing the additional branch that has to create a temporary version of the zipped folder.
  • If yes, then the best option is likely to write the partial zip file to a tmp folder before streaming it through the API (see the sketch below).
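
To illustrate the second option, here is a rough sketch of writing the partial zip to a tmp folder and returning it through the API (purely illustrative: the endpoint, paths and helper names are hypothetical, and FastAPI is assumed).

import shutil
import tempfile
from pathlib import Path

from fastapi import APIRouter
from fastapi.responses import FileResponse

router = APIRouter()


@router.get("/job/{job_id}/download/")
def download_partial_job_folder(job_id: int) -> FileResponse:
    # Hypothetical path of the (possibly still running) job folder.
    job_folder = Path(f"/somewhere/job_{job_id}")
    tmp_dir = tempfile.mkdtemp()
    # shutil.make_archive writes <base_name>.zip and returns its path;
    # cleanup of tmp_dir is omitted here for brevity.
    zip_path = shutil.make_archive(
        base_name=str(Path(tmp_dir) / f"job_{job_id}"),
        format="zip",
        root_dir=str(job_folder),
    )
    # FileResponse streams the file without loading it fully into memory.
    return FileResponse(zip_path, filename=f"job_{job_id}.zip")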


jluethi commented Jan 27, 2025

I agree that use case 1 is more critical than use case 2, but use case 2 can sometimes be helpful. There are two things that make me hesitant about the conclusion here:
a. It's convenient to have the option of seeing the logs of what is currently running, to check whether long-running jobs are progressing nicely.
b. We may want to rework how we expose these logs anyway, as part of flexibility + provenance: it would be very convenient to have individual, web-based access to job logs, instead of just the zip download.

For me, a) argues that we may want to refactor so that the partial zip is put in the tmp folder etc. But b) argues that the need to download full zips in intermediate states will eventually go away, so we shouldn't invest too much into it.
Mid-term, I'm fine with dropping support for use case 2, provided we expose individual logs in the web interface in an OK manner.

tcompa changed the title from "Move zipping of job folders from Python zipfile to gzip" to "Review zipping tool" on Jan 28, 2025

tcompa commented Jan 28, 2025

Status update (with @mfranzon, as always..):

  1. We checked that switching to a different implementation (from Python zipfile to the zip command) does not immediately lead to a significant speedup - ref Compare zipping performance of zipfile and zip #2221. This can be explored further, but it's not the top priority now.
  2. We confirmed that the current concurrency scheme was not treating the zipping feature properly, resulting in blocking behavior for the given worker. This provides a reasonable explanation, in terms of worker timeouts, for both a recent event and a past one.
  3. Rather than making the zipping feature itself async, we are cleaning up a long-standing issue with our concurrency scheme for job execution (ref Review concurrency model for job execution #2218). The upcoming version makes it clear that the job-execution functions (starting from the top-level submit_workflow function) all run in a specific thread (or in other threads, if these are opened at lower levels) and never re-use the event loop that handles API requests (a minimal sketch is given after this list). In our understanding, this will prevent the kind of worker timeout we observed recently.
  4. We did not modify or remove the "download-partial-zip-folder" feature, which therefore remains identical to the current version. Warning: within this feature, the zipping must take place on-the-fly when the download-logs API request is handled, meaning that for large folders the endpoint may not be responsive. Note that this is the same behavior we currently have, and it is not due to any of the ongoing updates. It may become less relevant if we expose granular log downloads in a future version.
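
As a rough illustration of point 3 (not the actual fractal-server code; submit_workflow is used here only as a placeholder), the idea is that the job-execution entry point is dispatched to its own thread, so it never runs on the event loop that serves API requests:

import threading


def submit_workflow(job_id: int) -> None:
    # Placeholder for the job-execution entry point (which includes the
    # blocking zipping of the job folder at the end).
    ...


def dispatch_job(job_id: int) -> None:
    # Running the whole job in a dedicated thread means that blocking calls
    # (such as zipping) cannot stall the API event loop anymore.
    threading.Thread(target=submit_workflow, args=(job_id,)).start()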


jluethi commented Jan 28, 2025

Thanks for the summary!

Rather than making the zipping feature itself async, we are cleaning up a long-standing issue with our concurrency scheme for job execution

So, for my understanding: we wouldn't expect any worker timeouts due to zipping anymore with this server version? But we could still trigger them, for the time being, with the download-partial-zip-folder action (not a very common one)?


tcompa commented Jan 28, 2025

We wouldn't expect any worker timeouts due to zipping anymore with this server version?

Agreed.

But we could still trigger them with the download partial zip folder (not a very common action) for the time being?

That's true (and it did not change with this version). As you note, this only applies to downloading logs for an ongoing job (where the zip has to be created on-the-fly).

Ref #2223
