Grpo loss #553
Conversation
test/chunked_loss/test_grpo_loss.py (Outdated)
attention_mask.view(-1)[mask_indices] = 0

# Create flat rewards with shape [B * num_generations] (one reward per sampled completion)
rewards = torch.randn(B * num_generations, device=device, dtype=dtype)
If it is not too much work, would it be possible to test a scenario where the rewards are all the same, i.e., all 1s?
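(For context: with identical rewards the group standard deviation is zero, so the group-normalized advantages should collapse to zero rather than blow up. Below is a minimal sketch of that edge case, assuming the usual GRPO group-wise normalization; the variable names and the 1e-4 epsilon are illustrative, not taken from this PR.)

```python
import torch

B, num_generations = 2, 4

# All-ones rewards: every completion in a group gets the same score.
rewards = torch.ones(B * num_generations)

# Group-wise normalization (mean/std per prompt), as in GRPO.
grouped = rewards.view(B, num_generations)
mean = grouped.mean(dim=1, keepdim=True)
std = grouped.std(dim=1, keepdim=True)
advantages = (grouped - mean) / (std + 1e-4)  # eps guards the zero-std case

# With identical rewards the advantages are exactly zero, so the policy
# term of the loss vanishes and only the KL penalty (if any) remains.
assert torch.allclose(advantages, torch.zeros_like(advantages))
```

An all-equal-rewards test therefore mainly exercises the epsilon guard and checks that the loss degenerates to the pure KL term.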
LGTM, nice test.
Great work! Great use of comments.
There was also a small change to the loss upstream (huggingface/trl#2881), which I will integrate here too.
Nice work!!
Summary
Adds the GRPO chunked loss.
Fixes issue #548.
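For orientation, a naive (non-chunked) reference of the GRPO objective might look roughly like the sketch below. This is a sketch under the usual formulation (group-normalized advantages, a per-token policy term with a detached ratio, and a k3-style KL penalty against a reference policy), not the fused implementation added in this PR; the function name, signature, and default values are illustrative.

```python
import torch

def naive_grpo_loss(log_probs, ref_log_probs, rewards, completion_mask,
                    num_generations, beta=0.04, eps=1e-4):
    """Unfused GRPO reference; per-token log-probs have shape [B * G, T]."""
    # Group-normalized advantages: one scalar per sampled completion.
    grouped = rewards.view(-1, num_generations)
    advantages = (grouped - grouped.mean(dim=1, keepdim=True)) / (
        grouped.std(dim=1, keepdim=True) + eps
    )
    advantages = advantages.view(-1, 1)  # [B * G, 1], broadcast over tokens

    # Policy term: exp(logp - logp.detach()) equals 1 in value but carries
    # the gradient of logp, so the gradient is -advantage * d(logp).
    ratio = torch.exp(log_probs - log_probs.detach())
    per_token_loss = -ratio * advantages

    # KL penalty against the reference policy (k3 estimator).
    kl = torch.exp(ref_log_probs - log_probs) - (ref_log_probs - log_probs) - 1
    per_token_loss = per_token_loss + beta * kl

    # Mask padding and average over valid tokens, then over sequences.
    mask = completion_mask.to(per_token_loss.dtype)
    token_counts = mask.sum(dim=1).clamp(min=1.0)
    per_sequence = (per_token_loss * mask).sum(dim=1) / token_counts
    return per_sequence.mean()
```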
Testing Done
- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence