[Algorithm] CrossQ #2033

BY571 · 2024-03-21T17:47:04Z

Description

Motivation and Context

Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax close #15213 if this solves the issue #15213

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds core functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

I have read the CONTRIBUTION guide (required)
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.

pytorch-bot · 2024-03-21T17:47:07Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2033

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure

As of commit c010e39 with merge base a151923:

NEW FAILURES - The following jobs have failed:

Continuous Benchmark (PR) / CPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Continuous Benchmark (PR) / GPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t 3d0ffbe271e081f1b65cbe69984f0ceb263fe7ce8ac7975a746227e56587e4df /exec failed with exit code 139
Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job (gh)
RuntimeError: Command docker exec -t 19b79cee76eab8416c0681b76414c6fdb9d19f9e8db8fd5a2fd1f9d7cbc616e0 /exec failed with exit code 1

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Unit-tests on Windows / unittests-cpu / windows-job (gh) (trunk failure)
test/test_transforms.py::TestActionDiscretizer::test_trans_parallel_env_check[False]

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torchrl/objectives/crossq.py

BY571 · 2024-03-21T18:02:25Z

Performance with separate target_computation looks good.

But we still need to check speed. It should be similar to our SAC implementation.
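A speed check like the one mentioned above could be run by timing one update step of each algorithm on the same batch. This is only a minimal sketch: `crossq_step` and `sac_step` are hypothetical placeholders standing in for "one optimisation step of each loss", and `time_step` is an illustrative helper, not part of TorchRL.

```python
import time
from statistics import median


def time_step(step_fn, n_warmup=3, n_runs=20):
    """Return the median wall-clock time in seconds of calling step_fn()."""
    # Warm-up calls are discarded so one-off setup costs do not skew the result.
    for _ in range(n_warmup):
        step_fn()
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        step_fn()
        timings.append(time.perf_counter() - start)
    return median(timings)


# Hypothetical placeholders: replace the bodies with one real update step each.
def crossq_step():
    sum(i * i for i in range(1000))


def sac_step():
    sum(i * i for i in range(1000))


crossq_time = time_step(crossq_step)
sac_time = time_step(sac_step)
ratio = crossq_time / sac_time
print(f"CrossQ: {crossq_time:.6f}s  SAC: {sac_time:.6f}s  ratio: {ratio:.2f}")
```

When the steps run on a GPU, the device should be synchronised (for example with `torch.cuda.synchronize()`) before each clock read, otherwise only the kernel launch time is measured.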


vmoens

Thanks for this!
There are just a couple of things to fix before merging.

sota-implementations/crossq/crossq.py

sota-implementations/crossq/utils.py

torchrl/objectives/crossq.py

vmoens · 2024-06-12T08:55:57Z

@BY571 we should also add it to the sota benchmarks

sota-implementations/crossq/batchrenorm.py

vmoens · 2024-06-27T11:34:17Z

sota-implementations/crossq/batchrenorm.py

+import torch.nn as nn
+
+
+class BatchRenorm(nn.Module):


Let's put this in the modules, no?

And add it to the docs.
Happy to write a couple of tests.
Is it a copy-paste? If so, can we check the license?
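For context on what such a module does, here is a minimal sketch of batch renormalization for `(batch, features)` inputs. It is written from the general description of the technique and is not the code in `batchrenorm.py`; the class name `BatchRenormSketch` and the default values of `momentum`, `r_max` and `d_max` are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class BatchRenormSketch(nn.Module):
    """Illustrative batch renormalization over the batch dimension."""

    def __init__(self, num_features, momentum=0.01, eps=1e-5, r_max=3.0, d_max=5.0):
        super().__init__()
        self.momentum, self.eps = momentum, eps
        self.r_max, self.d_max = r_max, d_max
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        if self.training:
            batch_mean = x.mean(dim=0)
            batch_var = x.var(dim=0, unbiased=False)
            batch_std = (batch_var + self.eps).sqrt()
            running_std = (self.running_var + self.eps).sqrt()
            # r and d correct batch statistics towards the running ones;
            # they are treated as constants, so no gradient flows through them.
            r = (batch_std.detach() / running_std).clamp(1 / self.r_max, self.r_max)
            d = ((batch_mean.detach() - self.running_mean) / running_std).clamp(
                -self.d_max, self.d_max
            )
            x_hat = (x - batch_mean) / batch_std * r + d
            with torch.no_grad():
                # Exponential moving average of the batch statistics.
                self.running_mean.mul_(1 - self.momentum).add_(batch_mean, alpha=self.momentum)
                self.running_var.mul_(1 - self.momentum).add_(batch_var, alpha=self.momentum)
        else:
            # At evaluation time only the running statistics are used.
            x_hat = (x - self.running_mean) / (self.running_var + self.eps).sqrt()
        return self.weight * x_hat + self.bias


layer = BatchRenormSketch(4)
x = torch.randn(32, 4)
layer.train()
y_train = layer(x)
layer.eval()
y_eval = layer(x)
```

The two branches make the train/eval difference explicit, which is also the behaviour a unit test for the module would most likely want to cover.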

sota-implementations/crossq/batchrenorm.py

torchrl/objectives/crossq.py

vmoens · 2024-06-27T15:50:30Z

torchrl/objectives/crossq.py

+                next_tensordict.set(self.tensor_keys.action, next_action)
+                next_sample_log_prob = next_dist.log_prob(next_action)
+
+        # TODO: separate forward pass seems faster than the combined.
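
The TODO above contrasts two ways of evaluating the Q-network. The sketch below illustrates the difference on a toy network with a `BatchNorm1d` layer; the network, shapes and variable names are assumptions for illustration and do not reproduce the loss code in this PR.

```python
import torch
import torch.nn as nn

# Toy Q-network over concatenated (observation, action) features.
qnet = nn.Sequential(
    nn.Linear(6, 16), nn.BatchNorm1d(16), nn.ReLU(), nn.Linear(16, 1)
)
qnet.train()

obs_action = torch.randn(8, 6)
next_obs_action = torch.randn(8, 6)

# Combined: one forward pass over both batches, so the normalization layer
# computes its statistics over current and next samples together.
joint = torch.cat([obs_action, next_obs_action], dim=0)
q_current, q_next = qnet(joint).chunk(2, dim=0)

# Separate: two forward passes, each normalized with its own batch statistics.
q_current_sep = qnet(obs_action)
with torch.no_grad():
    q_next_sep = qnet(next_obs_action)
```

In training mode the two variants generally give different values, because the batch statistics differ, so any speed comparison should be read together with a check that learning performance is unchanged.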



vmoens · 2024-07-09T07:36:01Z

test/test_cost.py

+            loss_function="l2",
+            **kwargs,
+        )
+        sd = loss_fn.state_dict()


We should check that this contains the values of the actor and qvalue nets.
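
A check of this kind could look like the sketch below. It uses a plain `nn.Module` stand-in, `ToyLoss`, with submodules named `actor_network` and `qvalue_network`; both the class and the key prefixes are assumptions, and the keys exposed by the real loss module's `state_dict()` may be named differently.

```python
import torch
import torch.nn as nn


class ToyLoss(nn.Module):
    """Hypothetical stand-in for a loss module holding two networks."""

    def __init__(self):
        super().__init__()
        self.actor_network = nn.Linear(3, 2)
        self.qvalue_network = nn.Linear(5, 1)


loss_fn = ToyLoss()
sd = loss_fn.state_dict()

# Both networks must contribute entries to the state dict.
actor_keys = [k for k in sd if k.startswith("actor_network")]
qvalue_keys = [k for k in sd if k.startswith("qvalue_network")]
assert actor_keys, "no actor parameters found in state_dict"
assert qvalue_keys, "no qvalue parameters found in state_dict"

# The stored tensors must match the live parameter values.
for name, param in loss_fn.named_parameters():
    assert torch.equal(sd[name], param)
```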

torchrl/objectives/crossq.py

vmoens · 2024-07-09T14:40:46Z

@BY571 examples CI is failing:

  File "/pytorch/rl/sota-implementations/crossq/utils.py", line 244, in make_loss_module
    loss_module = CrossQLoss(
TypeError: __init__() got an unexpected keyword argument 'delay_actor'
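
This kind of mismatch between an example script and a constructor can be caught before the call with `inspect.signature`. The sketch below uses a hypothetical `ToyLoss` class and an illustrative helper `unexpected_kwargs`; neither is part of TorchRL, and the argument names are only chosen to mirror the traceback.

```python
import inspect


class ToyLoss:
    """Hypothetical loss whose constructor has no 'delay_actor' argument."""

    def __init__(self, actor_network, qvalue_network, loss_function="l2"):
        self.actor_network = actor_network
        self.qvalue_network = qvalue_network
        self.loss_function = loss_function


def unexpected_kwargs(cls, kwargs):
    """Return the sorted names in kwargs that cls.__init__ does not declare."""
    params = inspect.signature(cls.__init__).parameters
    return sorted(name for name in kwargs if name not in params)


kwargs = {"actor_network": "actor", "qvalue_network": "qnet", "delay_actor": False}
rejected = unexpected_kwargs(ToyLoss, kwargs)
print(rejected)
```

The helper only inspects declared parameter names, so it reports names as unexpected even when a constructor would absorb them through `**kwargs`.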

BY571 · 2024-07-09T17:08:29Z

@vmoens updated :)

vmoens

LGTM, thanks a million for this!

BY571 added 7 commits March 20, 2024 20:42

add crossQ examples

0a23ae8

add loss

9bdee71

Update naming experiment

570a20e

update

5086249

update add tests

c3a927f

detach

d1c9c34

update tests

e879b7c

facebook-github-bot added the CLA Signed label Mar 21, 2024

BY571 added 2 commits March 21, 2024 18:50

update run_test.sh

75255e7

move crossq to sota-implementations

a7b79c3

BY571 commented Mar 21, 2024

torchrl/objectives/crossq.py (outdated, resolved)

update loss

be84f3f

BY571 marked this pull request as ready for review March 26, 2024 18:40

update cat prediction

2170ad8

BY571 commented Mar 26, 2024

torchrl/objectives/crossq.py (resolved)

vmoens added the new algo label Apr 8, 2024

Merge branch 'main' into crossQ

75d4cee

# Conflicts: # .github/unittest/linux_examples/scripts/run_test.sh

vmoens reviewed Jun 12, 2024

BY571 added 8 commits June 26, 2024 14:43

Merge branch 'main' into crossQ

7711a4e

add batchrenorm to crossq

f0ac167

Merge branch 'crossQ' of github.com:BY571/rl into crossQ

37abb14

small fixes

bc7675a

update docs and sota checks

9543f2e

hyperparam fix

53e35f7

test

172e1c0

update batch norm tests

fdb7e8b

vmoens reviewed Jun 27, 2024

BY571 and others added 17 commits July 8, 2024 09:32

update lr param

02c94ff

Merge branch 'crossQ' of https://github.com/BY571/rl into crossQ

93b6a7b

Apply suggestions from code review

4b914e6

Merge remote-tracking branch 'origin/main' into crossQ

af8c64a

Merge branch 'crossQ' of https://github.com/BY571/rl into crossQ

845c8a9

set qnet eval in actor loss

7b4a69d

Merge branch 'crossQ' of https://github.com/BY571/rl into crossQ

77de044

take off comment

35c7a98

amend

68a1a9f

Merge branch 'crossQ' of https://github.com/BY571/rl into crossQ

c04eb3b

Merge remote-tracking branch 'origin/main' into crossQ

12672ee

# Conflicts: # torchrl/envs/batched_envs.py

amend

7fbb27d

amend

ff80481

amend

caf702e

amend

70e2882

amend

ccd1b7f

Merge remote-tracking branch 'origin/main' into crossQ

d3c8b0e

# Conflicts: # torchrl/envs/batched_envs.py

vmoens reviewed Jul 9, 2024

vmoens and others added 6 commits July 9, 2024 09:07

Apply suggestions from code review

d3e0bb1

amend

349cb28

amend

75a43e7

fix device error

abada6c

Update objective delay actor

c878b81

Update tests not expecting target update

f222b11

update example utils

067b560

amend

c010e39

vmoens approved these changes Jul 10, 2024

vmoens merged commit a0a47a9 into pytorch:main Jul 10, 2024
59 of 64 checks passed
