This repository has been archived by the owner on Sep 24, 2024. It is now read-only.

lm-harness IndexError: list index out of range due to missing bnb_4bit_compute_dtype #48

Open
aittalam opened this issue Feb 13, 2024 · 4 comments

Comments

@aittalam
Member

The quantization parameter bnb_4bit_compute_dtype seems to be optional for fine-tuning, but when we run lm-harness we get an error if it is not specified in the config.

Example: http://10.145.91.219:8265/#/jobs/raysubmit_RvS6DHdgeMPYQjW9

Solution: the code runs properly if we add the parameter to the config, e.g.

```yaml
quantization:
  load_in_4bit: True
  bnb_4bit_quant_type: "fp4"
  bnb_4bit_compute_dtype: "bfloat16"
```

Shall we always set a default value (e.g. bfloat16), or just document the parameter to make sure it is specified when needed?
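
For reference, a minimal sketch of how these fields would map onto transformers' BitsAndBytesConfig, assuming that is what our quantization config ultimately gets translated into (the bfloat16 default shown here is the proposal above, not current behavior):

```python
import torch
from transformers import BitsAndBytesConfig

# Hypothetical translation of the YAML above; defaulting
# bnb_4bit_compute_dtype to bfloat16 when it is absent from the
# config would avoid the eval-time failure.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```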

@sfriedowitz
Contributor

This is a good opportunity to use a Pydantic @model_validator to validate the combination of parameters on the configuration, ensuring a fail-early approach to invalid choices.
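
A rough sketch of what that could look like (QuantizationConfig and the exact field set are assumptions mirroring the YAML above; requires Pydantic v2):

```python
from typing import Optional

from pydantic import BaseModel, model_validator


class QuantizationConfig(BaseModel):
    load_in_4bit: bool = False
    bnb_4bit_quant_type: Optional[str] = None
    bnb_4bit_compute_dtype: Optional[str] = None

    @model_validator(mode="after")
    def check_compute_dtype(self) -> "QuantizationConfig":
        # Fail early at config-parse time: loading in 4-bit without a
        # compute dtype is what surfaced later as the IndexError in
        # lm-harness.
        if self.load_in_4bit and self.bnb_4bit_compute_dtype is None:
            raise ValueError(
                "bnb_4bit_compute_dtype must be set when load_in_4bit is True"
            )
        return self
```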

@sfriedowitz
Contributor

BTW, in the newest versions of peft, the LoRA implementation comes with a built-in quantization method called LoftQ. They mention that you should use it rather than pass an already-quantized model when training an adapter: https://github.com/huggingface/peft/blob/234774345b1c2a8c89d7851b85a3f4fdc89bd454/src/peft/tuners/lora/config.py#L100

So if you were doing finetuning here, you probably don't need to be using the quantization config at all.
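
For anyone picking this up, a hedged sketch of the LoftQ path in recent peft versions (the model id and hyperparameters are illustrative, not recommendations):

```python
from peft import LoftQConfig, LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Per the peft docs linked above: load the base model *unquantized*;
# LoftQ quantizes the weights itself and initializes the LoRA matrices
# to compensate for the quantization error.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1"  # illustrative model id
)

lora_config = LoraConfig(
    init_lora_weights="loftq",
    loftq_config=LoftQConfig(loftq_bits=4),
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumption: typical attention projections
)
peft_model = get_peft_model(model, lora_config)
```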

@aittalam
Member Author

That is good to know!
I got the error in the lm-harness eval though. What puzzled me is that I saw no errors during fine-tuning but did get one during eval. Perhaps the parameter is required at inference time, while HF code somehow infers it at tuning time?

@sfriedowitz
Contributor

Interesting, I'm not sure. Is it possible you were already passing the LoftQ config during finetuning, and that was handling quantization for you? I'm not certain which defaults HuggingFace uses when the dtype is not explicitly specified.
