## Issue description

The goal is to add a caching capability to the backend so that the outcome of an inference request made on an image is stored. If a user tries different models on the same image, each inference should be executed only once: switching to a model that has already produced an inference on the chosen image should return the outcome saved in the cache.
## Steps to Reproduce

1. Send a request to the `/inf` route twice (see the example request below).
2. The first time, the system returns the inference result from the pipeline.
3. The second time, the system returns the result stored in the cache.
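A minimal way to reproduce this from a script, assuming a locally running backend. Only the `/inf` route comes from this issue; the host, port, and payload fields are illustrative assumptions:

```python
import requests  # any HTTP client would do; requests is used for illustration

URL = "http://localhost:8080/inf"  # assumed host and port

with open("sample_seed.png", "rb") as f:
    image_bytes = f.read()

payload = {"pipeline": "default"}  # assumed field names

# First call: the backend is expected to run the pipeline.
first = requests.post(URL, files={"image": image_bytes}, data=payload)
# Second call, same image and pipeline: expected to come from the cache.
second = requests.post(URL, files={"image": image_bytes}, data=payload)
```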
## Expected Behavior

If the cache does not contain a result for the chosen image and pipeline, the system should invoke the necessary model(s) to produce the inference result. If, on the other hand, the cache already holds the pipeline's inference result for that image, the system should simply return the cached result.
## Current Behavior

The model(s) are called on every inference request, since the inference result is not cached.
## Possible Solution

This issue references this comment:

> I'm not exactly sure how it works on the front end, but there is already a caching functionality that keeps track of the result.
>
> As of now, inferencing would overwrite what is in there. Maybe we could add another parameter that would contain the model that displays the score, classification, boxes, and everything else related?

Frontend function: `loadResultToCache`

For the backend, implementing the same idea means keeping track of the image passed in the inference request, pairing it with the pipeline, and keeping the result when it is returned. Then, if the same image is requested again, we send back the result from the cache.

*Originally posted by @MaxenceGui in ai-cfia/nachet-frontend#96 (comment)*
```mermaid
---
title: Test
---
flowchart TD
    image[Send image to inference request]
    check(Backend checks if the image already exists <br> in the cache and if the pipeline name is the same)
    inference[image not in cache <br> or pipeline not called]
    cache[image in cache <br> and pipeline called]
    inf_res[send back result from inference]
    cache_res[send back result stored in cache]
    image -- from frontend --> check
    check --> inference & cache
    inference --> inf_res
    cache --> cache_res
```
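A minimal sketch of that flow, assuming an in-memory dict as the cache and a hypothetical `run_pipeline` helper standing in for the real model call (a production backend might use Redis or a database instead):

```python
import hashlib

# Illustrative in-memory cache, keyed by (image hash, pipeline name).
cache: dict[tuple[str, str], dict] = {}

def run_pipeline(pipeline_name: str, image_bytes: bytes) -> dict:
    """Placeholder for the backend's actual model call."""
    raise NotImplementedError

def cached_inference(image_bytes: bytes, pipeline_name: str) -> dict:
    """Return the cached result when present, otherwise run the pipeline."""
    key = (hashlib.sha256(image_bytes).hexdigest(), pipeline_name)
    if key in cache:
        # Image already inferred with this pipeline: skip the model call.
        return cache[key]
    result = run_pipeline(pipeline_name, image_bytes)
    cache[key] = result  # keep the result for later requests
    return result
```

Hashing the image bytes rather than using a filename means the same image uploaded under two different names still hits the same cache entry.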
There should be some discussion in the spec about the pipeline and the underlying model versions: if a model version in the pipeline is updated, it should invalidate previous calls. There should perhaps also be a mechanism to explicitly invalidate the cache (from the frontend?). One possible shape for this is sketched below.
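Extending the sketch above, one hedged option is to fold the model version(s) into the cache key, so a version bump simply stops matching older entries, and to expose an explicit invalidation hook for the frontend. All names here are assumptions:

```python
import hashlib

def cache_key(image_bytes: bytes, pipeline_name: str,
              model_versions: tuple[str, ...]) -> tuple:
    # With the versions in the key, a pipeline whose model was updated
    # no longer matches entries produced by the previous version.
    digest = hashlib.sha256(image_bytes).hexdigest()
    return (digest, pipeline_name, model_versions)

def invalidate_image(image_bytes: bytes) -> None:
    """Drop every cached entry for one image (e.g. on frontend request)."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    for key in [k for k in cache if k[0] == digest]:
        del cache[key]
```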
## Additional Context

## Tasks

- [ ] Update `inference_request` to return the cache result if True (a hedged sketch follows below)
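A sketch of what that task might look like; the parameter name `use_cache` and the signature are assumptions, not the actual `inference_request` signature, and `cache`/`run_pipeline` are reused from the sketch in the Possible Solution section:

```python
import hashlib

def inference_request(image_bytes: bytes, pipeline_name: str,
                      use_cache: bool = True) -> dict:
    """If use_cache is True and a result exists for this (image, pipeline)
    pair, return it instead of calling the model(s) again."""
    key = (hashlib.sha256(image_bytes).hexdigest(), pipeline_name)
    if use_cache and key in cache:
        return cache[key]
    result = run_pipeline(pipeline_name, image_bytes)
    cache[key] = result
    return result
```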