
[BUG] max_seq_len can not be <= 2047 #240

Closed
4 tasks done
Originalimoc opened this issue Nov 15, 2024 · 1 comment · Fixed by #243
Labels
bug Something isn't working

Comments

Originalimoc commented Nov 15, 2024

OS

Linux

GPU Library

CUDA 12.x

Python version

3.11

Describe the bug

Setting max_seq_len <= 2047 (or to a value that is not a multiple of 256, possibly depending on whether cache_size is set?) triggers a logic bug(?) in https://github.com/turboderp/exllamav2/blob/master/exllamav2/generator/dynamic.py#L392:

  File "tabbyAPI/git/backends/exllamav2/model.py", line 716, in create_generator
    self.generator = ExLlamaV2DynamicGeneratorAsync(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imoc/miniconda3/envs/tabbyAPI/lib/python3.11/site-packages/exllamav2/generator/dynamic_async.py", line 16, in __init__
    self.generator = ExLlamaV2DynamicGenerator(*args, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/imoc/miniconda3/envs/tabbyAPI/lib/python3.11/site-packages/exllamav2/generator/dynamic.py", line 401, in __init__
    assert self.max_chunk_size % self.page_size == 0, \
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: max_chunk_size must be multiple of 256, received None
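For reference, a minimal sketch of the page-alignment arithmetic behind the failing check, assuming the 256-token page size quoted in the assertion message (values are illustrative, not taken from the library code):

  page_size = 256                       # page size quoted in the assertion message
  max_seq_len = 2047

  print(max_seq_len % page_size)        # 255 -> 2047 is not page-aligned, so the check fails
  print(7 * page_size, 8 * page_size)   # 1792 2048 -> nearest page-aligned sizes around 2047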

Reproduction steps

Launch a default server with max_seq_len set to 2047 or lower, or to a value that is not a multiple of 256(?).
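A hypothetical standalone sketch that approximates the server's call chain with direct exllamav2 calls; the model path is a placeholder, and whether it hits the exact same assertion depends on how the caller resolves max_chunk_size:

  from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
  from exllamav2.generator import ExLlamaV2DynamicGenerator

  config = ExLlamaV2Config("/path/to/any/exl2/model")   # placeholder path
  config.max_seq_len = 2047                             # not a multiple of the 256-token page size
  model = ExLlamaV2(config)
  cache = ExLlamaV2Cache(model, max_seq_len = 2047, lazy = True)
  model.load_autosplit(cache)
  tokenizer = ExLlamaV2Tokenizer(config)

  # May trigger the same assertion path as in the traceback above
  generator = ExLlamaV2DynamicGenerator(model = model, cache = cache, tokenizer = tokenizer)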

Expected behavior

All backend parameters are set correctly, even with the warning:
"WARNING: The given cache size (2047) is not a multiple of 256.
WARNING: Overriding cache_size with an overestimated value of 2048 tokens."
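What that warning describes amounts to rounding the requested size up to the next multiple of the page size; a minimal sketch of that behavior (not the actual tabbyAPI/exllamav2 code):

  def round_up_to_page(n: int, page_size: int = 256) -> int:
      # Overestimate to the next multiple of the page size, as the warning describes
      return ((n + page_size - 1) // page_size) * page_size

  print(round_up_to_page(2047))  # 2048 -> matches the "overestimated value" in the warning
  print(round_up_to_page(1337))  # 1536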

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
Originalimoc added the bug label Nov 15, 2024
DocShotgun (Member) commented

Can you try this? #243

I was able to load a model with a max_seq_len of 1337 and have the cache_size and chunk_size be autocorrected.
