float-type f32 will not start #81

Open
unclemusclez opened this issue Jun 1, 2024 · 2 comments

unclemusclez commented Jun 1, 2024

f32 will not start. I just converted the same model as q40 and it seems to work fine. I tried with ./dllama inference as well.

f32:

 sudo nice -n -20 ./dllama inference --model models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_model_TinyLlama-1.1B-intermediate-step-480k-1T_f32.m   --tokenizer models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_tokenizer_TinyLlama-1.1B-intermediate-step-480k-1T.t  --weights-float-type f32 --buffer-float-type f32 --nthreads 4  --workers 192.168.2.212:9998 192.168.2.213:9998 192.168.2.214:9998
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 2048
💡 hiddenDim: 5632
💡 nLayers: 22
💡 nHeads: 32
💡 nKvHeads: 4
💡 vocabSize: 32000
💡 seqLen: 2048
💡 nSlices: 4
💡 ropeTheta: 10000.0
📄 bosId: 1
📄 eosId: 2
Killed
ubuntu@ubuntu:~$ sudo nice -n -20 ./dllama worker --port 9998 --nthreads 4
Listening on 0.0.0.0:9998...
terminate called after throwing an instance of 'ReadSocketException'
  what():  std::exception
Aborted

q40:

ubuntu@ubuntu:~/distributed-llama$ sudo nice -n -20 ./dllama-api --model models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_model_TinyLlama-1.1B-intermediate-step-480k-1T_q40.m   --tokenizer models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_tokenizer_TinyLlama-1.1B-intermediate-step-480k-1T.t  --weights-float-type q40 --buffer-float-type q80 --nthreads 4  --workers 192.168.2.212:9998 192.168.2.213:9998 192.168.2.214:9998
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 2048
💡 hiddenDim: 5632
💡 nLayers: 22
💡 nHeads: 32
💡 nKvHeads: 4
💡 vocabSize: 32000
💡 seqLen: 2048
💡 nSlices: 4
💡 ropeTheta: 10000.0
📄 bosId: 1
📄 eosId: 2
🕒 ropeCache: 4096 kB
ubuntu@ubuntu:~$ sudo nice -n -20 ./dllama worker --port 9998 --nthreads 4
Listening on 0.0.0.0:9998...
💡 sliceIndex: 1
💡 nSlices: 4
🕒 ropeCache: 7680 kB
⏩ Received 6048 kB for block 0 (448 kB/s)
⏩ Received 6048 kB for block 1 (2729 kB/s)
⏩ Received 6048 kB for block 2 (2845 kB/s)
⏩ Received 6048 kB for block 3 (2786 kB/s)
⏩ Received 6048 kB for block 4 (2805 kB/s)
⏩ Received 6048 kB for block 5 (2925 kB/s)
⏩ Received 6048 kB for block 6 (2953 kB/s)
⏩ Received 6048 kB for block 7 (3095 kB/s)
⏩ Received 6048 kB for block 8 (3622 kB/s)
⏩ Received 6048 kB for block 9 (3830 kB/s)
⏩ Received 6048 kB for block 10 (3895 kB/s)
⏩ Received 6048 kB for block 11 (3849 kB/s)
⏩ Received 6048 kB for block 12 (3832 kB/s)
⏩ Received 6048 kB for block 13 (3847 kB/s)
⏩ Received 6048 kB for block 14 (3821 kB/s)
⏩ Received 6048 kB for block 15 (3922 kB/s)
⏩ Received 6048 kB for block 16 (3452 kB/s)
⏩ Received 6048 kB for block 17 (3859 kB/s)
⏩ Received 6048 kB for block 18 (3985 kB/s)
⏩ Received 6048 kB for block 19 (3379 kB/s)
⏩ Received 6048 kB for block 20 (3788 kB/s)
⏩ Received 6048 kB for block 21 (4115 kB/s)
b4rtaz (Owner) commented Jun 1, 2024

What is the size of the dllama_tokenizer_TinyLlama-1.1B-intermediate-step-480k-1T.t file?

unclemusclez (Author) commented:

424K Jun 1 02:22 dllama_tokenizer_TinyLlama-1.1B-intermediate-step-480k-1T.t
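For reference, this figure appears to come from a plain directory listing; a minimal sketch of the same check (paths taken from the commands above) is:

# list the tokenizer and the f32 model files with human-readable sizes
ls -lh models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_tokenizer_TinyLlama-1.1B-intermediate-step-480k-1T.t \
       models/TinyLlama-1.1B-intermediate-step-480k-1T/dllama_model_TinyLlama-1.1B-intermediate-step-480k-1T_f32.m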
