Host Executor uses Function Executor processes to run customer functions (#1102)

* Host Executor uses Function Executor processes to run customer functions

All tasks for a function version are routed into the same Function Executor process, because Function Executors support running tasks concurrently.

Host Executor implements a "one Function Executor for all function versions" policy (sketched below). Indexify customers rely on always having a single Function Executor per function across all of its versions. This ensures that a function that uses a GPU can run on the same machine for any of its versions. This policy is a temporary solution until the Server starts explicitly telling the Executor which Function Executors to start and stop.

Also removed the redundant num_workers configuration and some other dead code. This configuration was unused and is now completely obsolete with Function Executors running in separate processes.

Function Executor processes take about 1.2 s to start and set up a gRPC channel at p50, with rare durations jumping up to 3 s at p100. I removed all locks from the code and it didn't result in any speedup. I measured that these delays come from the library calls `await asyncio.create_subprocess_exec` and `asyncio.wait_for(channel.channel_ready())` (see the measurement sketch below). There's no plan to fix them for now, as it would take significant effort to figure out what's going on inside those libraries. Instead we bump the cold start duration limit in the test from 100 ms to 5 s. This limit is still okay for the product, as we don't have a hard requirement for cold start durations yet.

It looks like gRPC also added about 10 ms of latency to warm starts. This is a minor regression and doesn't require a test limit change.

Before:

```
cold_start_duration: 0.009701967239379883 seconds
warm_start_duration: 0.006789207458496094 seconds
```

After:

```
cold_start_duration: 1.2409627437591553 seconds
warm_start_duration: 0.016773223876953125 seconds
```

Host Executor uses abstract FunctionExecutor objects so that they and their Factory can be replaced in the future if, for example, we want to run Function Executors not as processes but as threads or containers (see the interface sketch below).

Testing:

make test
make fmt
make check

* Add a test for file descriptor caching

It can take minutes for a model to load into a GPU. Indexify customers rely on being able to cache file descriptors of loaded models between tasks of the same function version. Added a test that verifies this (a simplified sketch of the idea follows below).

Testing:

make fmt
make check
make test
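The "one Function Executor for all function versions" routing policy can be pictured as a registry keyed by function name alone. This is a minimal sketch under assumed shapes: `Task`, `FunctionExecutor`, and `HostExecutor` here are illustrative stand-ins, not the actual Indexify classes.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Task:
    # Illustrative task shape; the real Indexify task carries more fields.
    function_name: str
    function_version: str
    payload: bytes


class FunctionExecutor:
    """Stand-in for a per-function executor process (hypothetical API)."""

    def __init__(self, function_name: str):
        self.function_name = function_name

    async def run(self, task: Task) -> None:
        # Function Executors support concurrent tasks, so many run()
        # coroutines may be in flight against the same instance.
        await asyncio.sleep(0)  # placeholder for real work


class HostExecutor:
    """Routes every task for a function, regardless of its version, to the
    same FunctionExecutor: the "one Function Executor for all function
    versions" policy."""

    def __init__(self) -> None:
        self._executors: dict[str, FunctionExecutor] = {}

    def _executor_for(self, task: Task) -> FunctionExecutor:
        # Keyed by function name only, NOT by (name, version), so a
        # GPU-using function stays on the same machine across versions.
        if task.function_name not in self._executors:
            self._executors[task.function_name] = FunctionExecutor(task.function_name)
        return self._executors[task.function_name]

    async def submit(self, task: Task) -> None:
        await self._executor_for(task).run(task)
```

Keying by name only is what makes the policy temporary: once the Server dictates which Function Executors to start and stop, the key choice moves out of the Executor entirely.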
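A minimal harness for timing the two library calls blamed for the cold start might look like the following. The `function_executor` module name and the address are placeholders, not the real CLI; only `asyncio.create_subprocess_exec` and `channel_ready()` come from the commit message.

```python
import asyncio
import time

import grpc  # grpcio, using its asyncio (grpc.aio) API


async def measure_cold_start(address: str = "localhost:50051") -> None:
    # Time the subprocess spawn, the first of the two slow calls.
    t0 = time.monotonic()
    process = await asyncio.create_subprocess_exec(
        "python", "-m", "function_executor", "--address", address
    )
    spawn_secs = time.monotonic() - t0

    # Time gRPC channel setup, the second slow call.
    t1 = time.monotonic()
    channel = grpc.aio.insecure_channel(address)
    await asyncio.wait_for(channel.channel_ready(), timeout=5.0)
    channel_secs = time.monotonic() - t1

    print(f"subprocess spawn: {spawn_secs:.3f}s, channel ready: {channel_secs:.3f}s")

    await channel.close()
    process.terminate()
    await process.wait()


if __name__ == "__main__":
    asyncio.run(measure_cold_start())
```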
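The abstraction behind the swappable executors might be shaped roughly like this. The commit only states that abstract FunctionExecutor objects and their Factory exist; the method names below are guesses for illustration.

```python
from abc import ABC, abstractmethod


class FunctionExecutor(ABC):
    """Abstract executor: subclasses decide how customer code actually
    runs (a separate process today; maybe a thread or container later)."""

    @abstractmethod
    async def run_task(self, task_input: bytes) -> bytes:
        """Run one customer task and return its serialized output."""

    @abstractmethod
    async def shutdown(self) -> None:
        """Release the underlying process, thread, or container."""


class FunctionExecutorFactory(ABC):
    @abstractmethod
    async def create(self, function_name: str) -> FunctionExecutor:
        """Create an executor for the given function."""


class ProcessFunctionExecutorFactory(FunctionExecutorFactory):
    """Today's implementation: one subprocess plus gRPC channel per function."""

    async def create(self, function_name: str) -> FunctionExecutor:
        raise NotImplementedError("spawn subprocess, await channel_ready, wrap")
```

Host Executor depends only on the two abstract types, so replacing processes with threads or containers means adding a new factory, not touching the routing code.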
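The property the new test verifies can be illustrated with a toy pytest sketch. The real test runs two tasks through an actual Function Executor; this stand-in only shows the caching pattern customers rely on, with `load_model` and the module-level cache invented for the example.

```python
import os
import tempfile

# Module-level cache: it survives between tasks because all tasks of a
# function version run in the same Function Executor process.
_cached_fds: dict[str, int] = {}


def load_model(path: str) -> int:
    # Stands in for an expensive load (minutes to push a model to a GPU);
    # the open() should happen only once per executor process.
    if path not in _cached_fds:
        _cached_fds[path] = os.open(path, os.O_RDONLY)
    return _cached_fds[path]


def test_fd_cached_between_tasks() -> None:
    with tempfile.NamedTemporaryFile() as f:
        first = load_model(f.name)   # "task 1": opens and caches the fd
        second = load_model(f.name)  # "task 2": must hit the cache
        assert first == second
        os.close(_cached_fds.pop(f.name))
```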