[Feature] Add reduction parameter to On-Policy losses. #1890
Conversation
Great stuff!
I wonder whether, in the long term, we should structure the loss output better to avoid the batch-size issue pointed out here, e.g.
`TensorDict({"loss": {"actor": tensor, ...}, "metadata": {...}}, [])`
which could let us set a different batch-size at different levels.
It isn't going to be easy to move to that format though! So for now I think the best option would be to keep the output without a batch-size until we figure out how to account for it long-term.
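For illustration, a minimal sketch of the nested output structure suggested above; the keys ("loss", "metadata", "actor", "critic", "advantage") and shapes are assumptions for the example, not the final API:

```python
import torch
from tensordict import TensorDict

loss_td = TensorDict(
    {
        # scalar losses live in a sub-tensordict with an empty batch size
        "loss": TensorDict(
            {"actor": torch.tensor(0.5), "critic": torch.tensor(0.2)},
            batch_size=[],
        ),
        # per-sample diagnostics could keep the batch dimension instead
        "metadata": TensorDict(
            {"advantage": torch.randn(256)},
            batch_size=[256],
        ),
    },
    batch_size=[],
)

print(loss_td["loss", "actor"])        # tensor(0.5000)
print(loss_td["metadata"].batch_size)  # torch.Size([256])
```

Because the root batch-size is empty, each nested level is free to carry its own (compatible) batch-size, which is what would let losses stay scalar while metadata keeps the sample dimension.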
Co-authored-by: Vincent Moens <[email protected]>
I incorporated part of the suggestions and left comments on the points that might need further discussion.
LGTM, just something minor in the tests.
Description
This PR introduces a reduction option to the on-policy losses, similar to how PyTorch losses handle it (e.g. https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html).
The idea is to validate the approach on the on-policy losses first, then replicate it in the other losses.
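As a rough sketch of the reduction semantics this follows (mirroring torch.nn.MSELoss conventions); the `_reduce` helper below is an illustrative assumption, not the exact code added in this PR:

```python
import torch


def _reduce(loss: torch.Tensor, reduction: str = "mean") -> torch.Tensor:
    """Apply a torch-style reduction to a per-sample loss tensor."""
    if reduction == "none":
        return loss
    if reduction == "mean":
        return loss.mean()
    if reduction == "sum":
        return loss.sum()
    raise ValueError(f"Unknown reduction: {reduction}")


per_sample = torch.randn(256).abs()       # stand-in for a per-sample objective
print(_reduce(per_sample, "mean").shape)  # torch.Size([])   -> scalar loss
print(_reduce(per_sample, "none").shape)  # torch.Size([256]) -> unreduced
```

With `reduction="none"`, the per-sample loss terms are returned unreduced so users can weight or aggregate them downstream, while `"mean"` keeps the current default behaviour.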
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax `close #15213` if this solves the issue #15213.
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an `x` in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!