I am unable to reproduce the results from the paper for llama-7B-32k-longlora ppl. #188

masteryqq · 2024-05-28T12:26:47Z

After enabling flash_attn, I am unable to reproduce the paper's results for llama-7B-32k-longlora. The paper reports a perplexity (ppl) of 7.8 at a sequence length (seq_len) of 4096, but I get 9.8 using your eval_distributed.py.

masteryqq · 2024-05-28T12:31:58Z

masteryqq closed this as completed May 28, 2024

masteryqq reopened this May 28, 2024
