
Fix regression tests on CUDA and ROCm #23

Closed
jakki-amd opened this issue Nov 18, 2024 · 1 comment
jakki-amd (Collaborator) commented Nov 18, 2024

Fix regression tests

jakki-amd self-assigned this Nov 18, 2024
jakki-amd added this to the Week 47 milestone Nov 18, 2024
jakki-amd assigned eppane and unassigned jakki-amd Nov 25, 2024
jakki-amd modified the milestones: Week 47, Week 48 Nov 25, 2024
eppane (Collaborator) commented Nov 25, 2024

Reposting the comment appended to pull request #5:

Note: the test test_handler.py::test_huggingface_bert_model_parallel_inference fails due to:

ValueError: Input length of input_ids is 150, but max_length is set to 50. This can lead to unexpected behavior. You should consider increasing max_length or, better yet, setting max_new_tokens.

This indicates that preprocessing uses a different max_length than inference, which can be verified by looking at the handler as it was when the test was originally implemented: model.generate() defaults to max_length=50, while the tokenizer uses the max_length from setup_config (max_length=150). It seems that the BERT-based Textgeneration.mar needs an update.
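For illustration, here is a minimal sketch of the mismatch and of the max_new_tokens fix the error message suggests, assuming a Hugging Face causal LM. This is not the actual handler code: "gpt2" and the literal input text are placeholders standing in for whatever Textgeneration.mar packages.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical reproduction; "gpt2" stands in for the packaged model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token

# Preprocessing: tokenize with the max_length from setup_config (150).
inputs = tokenizer(
    "some input text",
    max_length=150,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)

# Inference as originally written: generate() falls back to its default
# max_length=50, shorter than the 150-token input, which triggers the
# ValueError quoted above on recent transformers versions.
# model.generate(**inputs)

# Fix: bound the number of newly generated tokens rather than the total
# sequence length, so the 150-token prompt is always accommodated.
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Alternatively, the tokenizer's max_length from setup_config could be passed through to generate() so both stages agree; either way the archive needs repackaging.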

eppane closed this as completed Nov 25, 2024