Cannot get take_test.py to run due to TypeError: vecquant4matmul() #3
edit the file
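(Presumably the checkpoint's tokenizer_config.json. A minimal sketch of the usual workaround, with an illustrative path, that swaps the deprecated "LLaMATokenizer" spelling for the "LlamaTokenizer" class newer transformers releases expect:)

import json

# Hedged sketch: point this at wherever the checkpoint actually lives
# (the path below is illustrative, not from the repo).
cfg_path = "models/alpaca-lora-7b-4bit/tokenizer_config.json"

with open(cfg_path) as f:
    cfg = json.load(f)

# Older checkpoints ship the deprecated "LLaMATokenizer" spelling.
cfg["tokenizer_class"] = "LlamaTokenizer"

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)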
That did indeed clear the errors about the tokenizers at the beginning; however, I am still getting the other error at the bottom: "TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:"
I'm using PyTorch 2., CUDA 11.7, and Python 3.9 in my environment. I don't know enough to figure out where this is coming from, but earlier I hit "AttributeError: module 'quant_cuda' has no attribute 'vecquant4recons'". I checked my version of quant_cuda and indeed it had no such attribute, but it did have "vecquant4recons_v1" and "_v2", so I changed the code to use "_v1". So I am wondering if this is a versioning problem? If it might be, may I ask which version of quant_cuda you're using? Again, I'm on Windows 10 PowerShell in a miniconda env I built just for this install, so it is relatively clean. Thanks!
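P.S. A quick sketch of how to check which 4-bit kernels a quant_cuda build actually exports (plain introspection, nothing repo-specific):

import quant_cuda

# List every vecquant4 kernel the compiled extension exposes; on my build
# this shows vecquant4recons_v1/_v2 but no plain vecquant4recons.
print([name for name in dir(quant_cuda) if name.startswith("vecquant4")])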
I think this all comes from:

atreat:~/dev/large_language_models/haltt4llm (main)> pip show gptq_llama

because:

atreat:~/dev/large_language_models/haltt4llm (main)> grep quant_cuda *
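For completeness, a small sketch to confirm which compiled extension Python is actually picking up and which gptq_llama release is installed (assuming the package imports under these names, as the pip output above suggests):

import importlib.metadata
import quant_cuda

print(quant_cuda.__file__)                       # the compiled extension in use
print(importlib.metadata.version("gptq_llama"))  # installed gptq_llama version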
Running on Windows 10 in a conda build environment. I think I have all the right torch, CUDA, etc. modules installed, and everything in requirements.txt installed fine. I'm running the NOTA (None of the Above) trivia test on Alpaca Lora 7B (4-bit):
python take_test.py --trivia fake_trivia_questions.json
The model and weights are installed from the Hugging Face repos and in the correct directories, but there is some error before the first question about the tokenizer class being different, so this could be related to that. Anyway, I get the following error:
Found 1 GPU(s) available.
Using device: cuda:0
Loading Model ...
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
Loaded the model in 2.36 seconds.
Fitting 4bit scales and zeros to half
Question 1: What type of energy is used to power the Salkran Tower of Pertinax in the Carpathian Mountains of Romania?
A. Solar
B. Wind
C. Gravitonic
D. None of the above
E. I don't know
Traceback (most recent call last):
File "D:\devgit\haltt4llm\take_test.py", line 189, in
main()
....
TypeError: vecquant4matmul(): incompatible function arguments. The following argument types are supported:
1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: torch.Tensor, arg5: int) -> None
Invoked with: tensor([[-0.0148, -0.0238, 0.0097, ..., 0.0231, -0.0175, 0.0318]],
device='cuda:0'), tensor([[ 2004248423, 2020046951, 1734903431, ..., -2024113529,
-1772648858, 1988708488],
[ 2004318071, 1985447543, 1719101303, ..., 1738958728,
1734834296, 1988584549],
[-2006481289, -2038991241, 2003200134, ..., -1734780278,
-2055714936, -1401572265],
...,
[-2022213769, -2021226889, 1735947895, ..., 2002357398,
1483176039, -1215859063],
[ 2005366614, -2022148249, 1752733576, ..., 394557864,
1986418055, 1483962710],
[ 1735820935, 1988720743, -2056755593, ..., -1468438152,
1718123383, 1150911352]], device='cuda:0', dtype=torch.int32), tensor([[0., 0., 0., ..., 0., 0., 0.]], device='cuda:0'), tensor([[0.0318],
[0.0154],
[0.0123],
...,
[0.0191],
[0.0206],
[0.0137]], device='cuda:0'), tensor([[0.2229],
[0.1078],
[0.0860],
...,
[0.1528],
[0.1439],
[0.0959]], device='cuda:0')
Any ideas why?
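Edit: reading the TypeError again, the compiled extension expects six arguments (five tensors plus a trailing int), while the call site passes only five tensors, which looks like the Python quant layer and the quant_cuda build coming from different versions. A minimal sketch to print the signature the binding was actually compiled with (pybind11 puts it in the docstring):

import quant_cuda

# The pybind11-generated docstring lists the exact argument types the
# installed extension expects for this kernel.
print(quant_cuda.vecquant4matmul.__doc__)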