NoneType object not subscriptable error in quantize.py #2818
Labels:
- Investigating
- Low Precision (issue about lower-bit quantization, including int8, int4, fp8)
- triaged (issue has been triaged by maintainers)
I'm following the steps in the MLLama example and the TensorRT-LLM multimodal documentation.
When I run the following command:
The process runs for a while but eventually fails with the following error:
TypeError: 'NoneType' object is not subscriptable
The error is raised while trying to access model.architecture[0], which suggests model.architecture is None at that point.
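For context, this failure mode occurs whenever a field expected to be a list is actually None and then gets indexed. A minimal sketch of the pattern and a defensive guard (the helper `get_architecture` and the config dict below are illustrative only, not the actual quantize.py internals):

```python
def get_architecture(config: dict):
    """Return the first architecture name, or None if the field is missing.

    Illustrative guard against the "'NoneType' object is not subscriptable"
    crash: indexing None (e.g. config["architecture"][0] when the value is
    None) raises TypeError, so check the value before subscripting.
    """
    archs = config.get("architecture")
    if not archs:  # covers both None and an empty list
        return None
    return archs[0]


# A config whose "architecture" field failed to populate reproduces the bug:
broken = {"architecture": None}
try:
    broken["architecture"][0]
except TypeError as e:
    print(e)  # 'NoneType' object is not subscriptable

# The guarded helper returns None instead of crashing:
print(get_architecture(broken))  # None
```

If the model config on disk is valid, a None here usually points at the config not being found or parsed rather than at the quantization step itself, so verifying the checkpoint path is a reasonable first debugging step.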
Model: LLaMA 3.2-90B-Vision
Precision: bfloat16 → fp8
OS: Ubuntu
GPU: GH200 x 2
CUDA version: 12.6
cuDNN version: 9.7.1.26-1
Please let me know if additional logs or debugging steps are needed.