Added diskcache to base model. #480
base: main
Conversation
Great addition! Ping us when ready :)
Thanks, I don't know when I'll have the capacity to add it to the other methods.
This might not be necessary anymore with PR #488.
Want us to close this one?
I personally think it would still be nice to have caching here too, but it is no longer strictly necessary for me.
To make local inference of large models more robust, it would still be useful.
Some models are very expensive to run inference on (e.g., Llama-3.3-70B). When we need to rerun inference, for example to add a new metric, it would be very time-consuming and expensive, especially since at least four 80 GB GPUs are necessary for inference.
We might want to add a flag to enable/disable caching. We might also want it for the other methods, like loglikelihood generation.
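The idea discussed above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual implementation: the PR uses the `diskcache` package, but to keep the example self-contained a minimal stdlib file cache stands in for it, and names like `cached_generate` and `use_cache` are made up here (with `use_cache` playing the role of the enable/disable flag suggested in the discussion).

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

# Temp directory so the sketch is self-contained; a real implementation
# would use a fixed cache path (or diskcache.Cache, as in the PR).
CACHE_DIR = Path(tempfile.mkdtemp())


def _cache_key(model_name, prompt, gen_kwargs):
    # Key on everything that affects the output, so rerunning an
    # evaluation with identical settings reuses cached generations.
    payload = json.dumps([model_name, prompt, gen_kwargs], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def cached_generate(model, model_name, prompt, use_cache=True, **gen_kwargs):
    # `use_cache` is the enable/disable flag proposed in the discussion.
    if not use_cache:
        return model(prompt, **gen_kwargs)
    path = CACHE_DIR / _cache_key(model_name, prompt, gen_kwargs)
    if path.exists():
        return pickle.loads(path.read_bytes())
    result = model(prompt, **gen_kwargs)  # the expensive inference call
    path.write_bytes(pickle.dumps(result))
    return result


# Toy usage: a fake "model" that records how often it is actually called.
calls = []


def fake_model(prompt):
    calls.append(prompt)
    return prompt.upper()


first = cached_generate(fake_model, "fake-model", "hello")
second = cached_generate(fake_model, "fake-model", "hello")  # served from disk
```

The same wrapper could cover the loglikelihood methods mentioned above, since the key only depends on the inputs and generation parameters, not on which method produced the result.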