
llama-node is not working at the moment #144

Closed
rozek opened this issue Nov 11, 2023 · 1 comment

Comments


rozek commented Nov 11, 2023

Hello!

Atome-FE/llama-node is mentioned in the docs as a Node.js/JavaScript binding for rwkv.cpp.

Right now, however, it does not seem to work with the current version of rwkv.cpp (as mentioned in issue 121), presumably because of the changes made in commit 8db73b1 ("update ggml"). The error message thrown is:

llama.cpp: loading model from /Users/andreas/rozek/AI/RWKV/RWKV-5-World-0.1B-v1-20230803-ctx4096-Q4_1.bin
error loading model: unknown (magic, version) combination: 67676d66, 00000065; is this really a GGML file?
llama_init_from_file: failed to load model
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Failed to initialize LLama context from file: /Users/andreas/rozek/AI/RWKV/RWKV-5-World-0.1B-v1-20230803-ctx4096-Q4_1.bin] {
  code: 'GenericFailure'
}

Node.js v18.17.0
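For reference, the two numbers in that error are simply the first eight bytes of the model file, read as two little-endian uint32 fields. A minimal, self-contained Python sketch (not part of llama-node; the example values are just the ones from the log above) that extracts them:

```python
import struct

def inspect_ggml_header(header: bytes) -> tuple:
    """Read (magic, version) from the first 8 bytes of a GGML-family file.

    Both fields are little-endian uint32 values; these are exactly the two
    numbers llama.cpp prints in its
    "unknown (magic, version) combination" error.
    """
    magic, version = struct.unpack("<II", header[:8])
    return magic, version

# Reproduce the values from the error above:
header = struct.pack("<II", 0x67676D66, 0x00000065)
magic, version = inspect_ggml_header(header)
print(f"magic=0x{magic:08x}, version=0x{version:08x}")
# → magic=0x67676d66, version=0x00000065
```

In practice you would pass `open(model_path, "rb").read(8)` instead of the packed bytes. The point is that the rwkv.cpp model file announces a (magic, version) pair that the llama loader bundled with the old llama-node does not accept, so the file is being rejected as an unknown format rather than being corrupted.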

Unfortunately, since I'm not a C++ programmer, I'm not able to revert just the changes that affect the GGML handling. Simply going back one commit further did not help (the resulting code did not compile).

Unless somebody is able to help me, I would recommend adding an appropriate note to the docs until llama-node has been fixed.

Collaborator

saharNooby commented Nov 11, 2023

Hi!

I've replied in the corresponding issue in the llama-node repo.

That said, the version of rwkv.cpp that llama-node uses is six months old. You would be missing out on proper sequence-mode support (a major optimization), the upcoming RWKV v5 support (which finally makes RWKV competitive with Transformers), and other ggml optimizations that are often invisible to end users but still significant. My personal recommendation, if possible, is to use rwkv.cpp directly in your app. Being "not a C++ programmer" should not be an issue, since we have a simple and powerful Python API.


2 participants