
Sampler variants #64

Open · AsbjornOlling wants to merge 11 commits into main
Conversation

@AsbjornOlling (Contributor) commented Dec 16, 2024:

This MR lets us implement different sampling methods with mutually exclusive configuration options, without showing irrelevant options in the UI.

It's currently very boilerplate-y. I'm tempted to try writing a proc macro to generate the repetitive code.

This will be a breaking change: it overhauls the sampling API in a backwards-incompatible way.
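
For a rough idea of the shape this enables, here is a sketch with made-up variant and field names (not the actual code in this MR): each sampling method carries only the options that apply to it, so the UI never has to show, say, a temperature slider for a greedy sampler.

```rust
// Sketch only: illustrative names, not the code in this branch.
#[allow(dead_code)]
enum SamplerConfig {
    Greedy,
    Temperature { temperature: f32, seed: u32 },
    MirostatV2 { tau: f32, eta: f32, seed: u32 },
}

fn describe(cfg: &SamplerConfig) -> String {
    match cfg {
        SamplerConfig::Greedy => "greedy (no options)".to_string(),
        SamplerConfig::Temperature { temperature, .. } => {
            format!("temperature sampling (temperature={temperature})")
        }
        SamplerConfig::MirostatV2 { tau, eta, .. } => {
            format!("mirostat v2 (tau={tau}, eta={eta})")
        }
    }
}

fn main() {
    let cfg = SamplerConfig::Temperature { temperature: 0.8, seed: 1234 };
    println!("{}", describe(&cfg));
}
```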

@AsbjornOlling (Contributor, Author) commented:

I wrote some low-effort macros. I think it's not too bad.

I also implemented all of the available llama.cpp samplers.

The only one I didn't add was "Infill". I don't really understand it, and it seems like there aren't Rust bindings for it.
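
For a sense of what the macros save, here is a sketch in the same spirit (purely illustrative; the actual macros in this branch differ): a `macro_rules!` macro that stamps out one constructor function per sampler variant.

```rust
// Illustrative sketch only, not the macros actually in this branch.
enum SamplerConfig {
    Temperature { temperature: f32, seed: u32 },
    MirostatV2 { tau: f32, eta: f32, seed: u32 },
}

// Generates one constructor function per listed variant.
macro_rules! sampler_constructors {
    ($($fn_name:ident => $variant:ident { $($field:ident : $ty:ty),* $(,)? }),* $(,)?) => {
        $(
            pub fn $fn_name($($field: $ty),*) -> SamplerConfig {
                SamplerConfig::$variant { $($field),* }
            }
        )*
    };
}

sampler_constructors! {
    temperature => Temperature { temperature: f32, seed: u32 },
    mirostat_v2 => MirostatV2 { tau: f32, eta: f32, seed: u32 },
}

fn main() {
    // Both constructors below were generated by the macro.
    let _t = temperature(0.8, 1234);
    let _m = mirostat_v2(5.0, 0.1, 1234);
}
```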

@AsbjornOlling (Contributor, Author) commented:

It would be nice to group the "penalty" parameters together in some way.
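
One possible shape for that grouping (hypothetical names and defaults, loosely modeled on llama.cpp's penalty parameters): bundle the penalty knobs into one struct that any sampler variant can embed as a unit, instead of four loose fields each.

```rust
// Hypothetical sketch: field names and defaults are illustrative.
#[derive(Clone, Copy, Debug)]
struct PenaltyConfig {
    last_n: i32,  // how many recent tokens the penalties consider
    repeat: f32,  // multiplicative repetition penalty (1.0 = disabled)
    freq: f32,    // frequency penalty (0.0 = disabled)
    present: f32, // presence penalty (0.0 = disabled)
}

impl Default for PenaltyConfig {
    fn default() -> Self {
        // Defaults in the spirit of llama.cpp's: penalties effectively off.
        PenaltyConfig { last_n: 64, repeat: 1.0, freq: 0.0, present: 0.0 }
    }
}

fn main() {
    println!("{:?}", PenaltyConfig::default());
}
```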

@AsbjornOlling changed the title from "Draft: Sampler variants" to "Sampler variants" on Dec 17, 2024
@emilnorsker (Contributor) commented:

 ERROR: Model worker crashed: Lama.cpp failed fetching chat template: the model has no meta val - returned code -1
   at: <nobodywho::NobodyWhoChat as godot_core::gen::classes::node::re_export::INode>::physics_process (src\lib.rs:117)
ERROR: Model output channel died. Did the LLM worker crash?
   at: <nobodywho::NobodyWhoChat as godot_core::gen::classes::node::re_export::INode>::physics_process (src\lib.rs:123)
ERROR: Model worker crashed: Lama.cpp failed fetching chat template: the model has no meta val - returned code -1
   at: <nobodywho::NobodyWhoChat as godot_core::gen::classes::node::re_export::INode>::physics_process (src\lib.rs:117)
ERROR: Model output channel died. Did the LLM worker crash?
   at: <nobodywho::NobodyWhoChat as godot_core::gen::classes::node::re_export::INode>::physics_process (src\lib.rs:123)

I am getting this error after changing to this branch. The only model I've found that it works with is Gemma 2 2B.

I have tried the following models:

  • Llama 2 7B Q4
  • Llama 3.2 3B Q2

Note: this is after upgrading the plugin in an existing repository. I will try to replicate in a new Godot project as well, and I'm also downloading a non-Llama model to check whether the issue is specific to Llama models.

@emilnorsker (Contributor) commented:

Alright, I tested two more models, and both work, so it looks like it's only Llama models that fail... The error persists on a clean project.

@emilnorsker (Contributor) commented:

I also tested the different parameters on the greedy sampler:

  • penalty last n = 10 => gave the same token every time, which means it works.
  • penalty repeat = -100, 100, 10 => gave the same token several times and kept the exact same output. It looks like something here is broken?
  • penalty freq = 1000 => no changes; same goes for -1000.
  • Same results with the rest of the parameters.

@AsbjornOlling (Contributor, Author) commented:

> ERROR: Model worker crashed: Lama.cpp failed fetching chat template: the model has no meta val - returned code -1
>
> I am getting this error after changing to this branch. The only model I've found that it works with is Gemma 2 2B. [...]
>
> Note: this is after upgrading the plugin in an existing repository. [...]

This error appears on the main branch as well as on this branch, for the Llama 2 model mentioned. I think the issue is that old GGUF files don't have all of the metadata fields that newer llama.cpp versions expect. Either way, it has nothing to do with the current MR.
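
For what it's worth, a defensive fallback could look like the sketch below; the types, function names, and the fallback template string are all placeholders, not the real bindings API:

```rust
// Placeholder sketch, not the actual llama.cpp bindings API: prefer the
// chat template from the GGUF metadata, and fall back to a hardcoded one
// when the metadata key is missing (as in old Llama 2 conversions).
const FALLBACK_TEMPLATE: &str = "<s>[INST] {prompt} [/INST]"; // assumed Llama 2 style

fn chat_template_or_fallback(meta_template: Option<&str>) -> String {
    match meta_template {
        Some(tmpl) => tmpl.to_string(),
        None => {
            eprintln!("warning: model has no chat template metadata; using fallback");
            FALLBACK_TEMPLATE.to_string()
        }
    }
}

fn main() {
    // An old GGUF with no tokenizer.chat_template metadata:
    println!("{}", chat_template_or_fallback(None));
}
```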

@AsbjornOlling (Contributor, Author) commented:

> I also tested the different parameters on the greedy sampler:
>
> penalty last n = 10 => gave the same token every time, which means it works. [...]

My best guess is that you got weird results because you used values far outside the sensible ranges. The repeat penalty, for example, is multiplicative: 1.0 means disabled, and negative values aren't meaningful.

Have a look at the llama.cpp docs for a bit more explanation: https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md

It might be worthwhile to set explicit ranges for the sampler config values, to give people an idea of which values are sane.
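
For instance, something along these lines; the bounds are rough guesses, not values taken from llama.cpp, and should be checked against the docs before copying:

```rust
// Sketch of clamping sampler config values into rough "sane" ranges.
fn clamp_penalty_repeat(v: f32) -> f32 {
    // Multiplicative penalty: 1.0 disables it; much above ~1.5 degrades output.
    v.clamp(1.0, 1.5)
}

fn clamp_penalty_freq(v: f32) -> f32 {
    // Additive penalty: 0.0 disables it.
    v.clamp(-2.0, 2.0)
}

fn main() {
    // The extreme values from the test above get pulled back into range:
    assert_eq!(clamp_penalty_repeat(-100.0), 1.0);
    assert_eq!(clamp_penalty_freq(1000.0), 2.0);
    println!("clamped ok");
}
```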
