
Implement Inference Caching in the Backend Instead of Frontend #56

Open · 2 tasks

MaxenceGui opened this issue Feb 9, 2024 · 1 comment

@MaxenceGui
Issue description

The goal is to add caching capability to the backend so that the outcome of an inference request made on an image is stored. If a user runs different models on the same image, each inference request should be executed just once: switching back to a model that has already produced an inference on the chosen image should return the outcome saved in the cache.

Steps to Reproduce

  1. Send a request to the /inf route twice (a client-side sketch follows this list)
  2. The first time, the system returns the inference result from the pipeline
  3. The second time, the system returns the result stored in the cache
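For illustration only, the two calls could look like this from a client, assuming the backend runs locally and /inf accepts an image upload together with a pipeline name; the payload shape and URL below are assumptions, not the actual API contract:

# Hypothetical client-side illustration of the two calls above; the payload
# shape ("image" file field, "model_name" form field) and the local URL are
# assumptions, not the actual API contract.
import requests

url = "http://localhost:8080/inf"  # assumed local backend address

with open("seed.png", "rb") as f:
    image = f.read()

payload = {"model_name": "some_pipeline"}  # hypothetical pipeline name
first = requests.post(url, files={"image": image}, data=payload)   # runs the pipeline
second = requests.post(url, files={"image": image}, data=payload)  # should come from the cache
assert first.json() == second.json()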

Expected Behavior

If the cache does not contain a result for the chosen image and pipeline, the system should invoke the necessary model(s) to produce the inference result. On the other hand, if the cache already holds a result for that image and pipeline, the system should simply return it.

Current Behavior

The model(s) are called at every inference request since the inference result is not cached.

Possible Solution

This issue references the following comment:

I'm not exactly sure how it works on the frontend, but there is already caching functionality that keeps track of the result:

scores: [],
classifications: [],
boxes: [],
annotated: false,
imageDims: [],
overlapping: [],
overlappingIndices: [],

As of now, a new inference would overwrite what is in there. Maybe we could add another parameter, keyed by model, that would contain the scores, classifications, boxes, and everything else related to that model's result?

model: {
  scores: [],
  ...
}

function loadResultToCache

For the backend

To implement the same idea in the backend, we would need to keep track of the image passed in the inference request, pair it with the pipeline, and keep the result when it is returned. Then, if the same image is requested again with the same pipeline, we send back the result from the cache, as shown in the diagram and the sketch below.

---
title: Inference caching flow
---
flowchart TD

image[Send image to inference request]
check(backend checks if the image already exists <br> in the cache and if the pipeline name is the same)
inference[image not in cache <br> or pipeline not called yet]
cache[image in cache <br> and pipeline already called]
inf_res[send back result from inference]
cache_res[send back result stored in cache]

image -- from frontend --> check
check --> inference & cache
inference --> inf_res
cache --> cache_res

Originally posted by @MaxenceGui in ai-cfia/nachet-frontend#96 (comment)

Additional Context

  • There is already caching functionality in the frontend to display the result

Tasks

  • Implement caching functionality to store the inference result
  • Implement a check in inference_request that returns the cached result when one exists (see the sketch below)
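A rough sketch of where that check could live, assuming a Quart/Flask-style inference_request handler behind /inf; the way the image and pipeline name are read from the request, and the run_pipeline helper, are assumptions (this reuses _inference_cache from the sketch above).

import base64
import hashlib

from quart import request, jsonify  # assumed framework

async def inference_request():
    data = await request.get_json()                  # assumed request format
    image_bytes = base64.b64decode(data["image"])    # assumed field name
    pipeline_name = data["model_name"]               # assumed field name

    key = (hashlib.sha256(image_bytes).hexdigest(), pipeline_name)
    if key in _inference_cache:                      # task 2: return the cached result
        return jsonify(_inference_cache[key])

    result = await run_pipeline(image_bytes, pipeline_name)
    _inference_cache[key] = result                   # task 1: store the result
    return jsonify(result)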
@MaxenceGui MaxenceGui self-assigned this Feb 9, 2024
@MaxenceGui MaxenceGui added this to Nachet Feb 9, 2024
@rngadam

rngadam commented Feb 12, 2024

There should be some discussion in the spec about the pipeline and the underlying model versions. If a model version in the pipeline is updated, it should invalidate previous calls. There should perhaps also be a mechanism to invalidate the cache (from the frontend?).

https://martinfowler.com/bliki/TwoHardThings.html
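One way to capture that, sketched as an assumption rather than a decided design, is to fold the model versions into the cache key, so that bumping a version in the pipeline naturally invalidates the older entries; the pipeline dictionary shape used here is hypothetical.

import hashlib

def cache_key(image_bytes: bytes, pipeline: dict) -> tuple[str, str]:
    # Hypothetical pipeline shape:
    # {"name": "...", "models": [{"name": "...", "version": "..."}, ...]}
    versions = ",".join(f"{m['name']}@{m['version']}" for m in pipeline["models"])
    return (hashlib.sha256(image_bytes).hexdigest(), f"{pipeline['name']}|{versions}")

A frontend-triggered invalidation could then be as simple as dropping every cache entry whose key starts with a given image hash.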

@MaxenceGui MaxenceGui moved this to Todo in Nachet Feb 20, 2024
@MaxenceGui MaxenceGui modified the milestone: M2(2024 March) Feb 27, 2024
@MaxenceGui MaxenceGui removed their assignment Mar 12, 2024