[Feature] CatFrames.make_rb_transform_and_sampler #2643

Merged on Dec 13, 2024 (2 commits)

@vmoens (Contributor) commented Dec 11, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 11, 2024
ghstack-source-id: 11488a7c1d8ed1003148ff907d30195d153997f4
Pull Request resolved: #2643

pytorch-bot bot commented Dec 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2643

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 17 Unrelated Failures

As of commit f3ff69c with merge base 17983d4:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Dec 11, 2024
torchrl/data/replay_buffers/samplers.py
torchrl/envs/transforms/transforms.py
examples/replay-buffers/catframes-in-buffer.py (outdated)
examples/replay-buffers/catframes-in-buffer.py (outdated)
torchrl/data/replay_buffers/samplers.py
torchrl/envs/transforms/transforms.py (outdated)
examples/replay-buffers/catframes-in-buffer.py (outdated)
transform=rb_transforms,
)

data = env.rollout(1000, break_when_any_done=False)

I think we should run 2k steps so that we are guaranteed to collect at least 2 trajectories (max is 1000 steps for DMC).
That would let us add assertions that the doc can use to explain further what is going on (e.g., "As you can see, the stacked frames for the frame at index 0 in the sampled batch are actually ...").
OR
Keep 1000 steps in the example, but make sure the unit tests include a test ensuring that CatFrames did not simply use the previous frame when that frame did not belong to the same trajectory.
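
To make the requested check concrete, here is a minimal pure-Python sketch of the property such a test could assert. It is not TorchRL code and does not call `CatFrames`: the helper names (`trajectory_starts`, `stack_frames`, `naive_stack`, `crosses_boundary`), the window size of 4, and the padding rule (repeat the first frame of the trajectory) are all assumptions chosen for illustration. Frames are represented by their step index, with two back-to-back trajectories of 1000 steps each, mirroring the 2k-step suggestion.

```python
from typing import List, Sequence


def trajectory_starts(traj_ids: Sequence[int]) -> List[int]:
    """For each step, the index of the first step of its trajectory."""
    starts: List[int] = []
    for t, tid in enumerate(traj_ids):
        if t == 0 or tid != traj_ids[t - 1]:
            starts.append(t)  # a new trajectory begins here
        else:
            starts.append(starts[-1])
    return starts


def stack_frames(traj_ids: Sequence[int], n: int) -> List[List[int]]:
    """Indices of the n frames stacked at each step, oldest first.

    Indices are clamped to the trajectory start, so missing history is
    padded with the trajectory's first frame (assumed padding rule).
    """
    starts = trajectory_starts(traj_ids)
    return [
        [max(j, starts[t]) for j in range(t - n + 1, t + 1)]
        for t in range(len(traj_ids))
    ]


def naive_stack(traj_ids: Sequence[int], n: int) -> List[List[int]]:
    """Buggy variant: always takes the previous n - 1 steps, ignoring boundaries."""
    return [
        [max(j, 0) for j in range(t - n + 1, t + 1)]
        for t in range(len(traj_ids))
    ]


def crosses_boundary(window: Sequence[int], t: int, traj_ids: Sequence[int]) -> bool:
    """True if any stacked frame comes from a different trajectory than step t."""
    return any(traj_ids[j] != traj_ids[t] for j in window)


# Two trajectories of 1000 steps each, stored contiguously.
traj_ids = [0] * 1000 + [1] * 1000
stacked = stack_frames(traj_ids, n=4)
naive = naive_stack(traj_ids, n=4)

# The property the unit test should enforce: no window mixes trajectories.
assert not any(crosses_boundary(w, t, traj_ids) for t, w in enumerate(stacked))
# The naive variant fails exactly at the first step of the second trajectory.
assert crosses_boundary(naive[1000], 1000, traj_ids)

print(stacked[1000])  # [1000, 1000, 1000, 1000]
print(naive[1000])    # [997, 998, 999, 1000]
```

With a single 1000-step rollout the boundary case at index 1000 never occurs, which is why the comment asks either for a longer rollout or for a dedicated unit test.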

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 13, 2024
ghstack-source-id: 7ecf952ec9f102a831aefdba533027ff8c4c29cc
Pull Request resolved: #2643
@vmoens vmoens merged commit f3ff69c into gh/vmoens/56/base Dec 13, 2024
48 of 59 checks passed
@vmoens vmoens deleted the gh/vmoens/56/head branch December 13, 2024 17:04

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4391s 0.4371s 2.2880 Ops/s 2.2627 Ops/s $\color{#35bf28}+1.12\%$
test_transformed 0.6160s 0.6131s 1.6311 Ops/s 1.6157 Ops/s $\color{#35bf28}+0.96\%$
test_serial 1.3620s 1.3560s 0.7375 Ops/s 0.7369 Ops/s $\color{#35bf28}+0.08\%$
test_parallel 1.4429s 1.3260s 0.7541 Ops/s 0.7612 Ops/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[True-True-True-True-True] 0.2685ms 29.5602μs 33.8292 KOps/s 33.9871 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[True-True-True-True-False] 57.6990μs 17.4617μs 57.2682 KOps/s 56.5656 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[True-True-True-False-True] 58.2390μs 16.6556μs 60.0400 KOps/s 59.4423 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[True-True-True-False-False] 59.8220μs 9.8609μs 101.4110 KOps/s 100.8762 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[True-True-False-True-True] 86.1920μs 31.7094μs 31.5364 KOps/s 31.2086 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-True-False-True-False] 0.4196ms 20.0496μs 49.8763 KOps/s 50.9187 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[True-True-False-False-True] 58.5900μs 18.4611μs 54.1680 KOps/s 53.7483 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-True-False-False-False] 85.7100μs 11.6604μs 85.7605 KOps/s 85.5590 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[True-False-True-True-True] 77.6850μs 33.5791μs 29.7804 KOps/s 29.5735 KOps/s $\color{#35bf28}+0.70\%$
test_step_mdp_speed[True-False-True-True-False] 71.3220μs 21.1497μs 47.2819 KOps/s 46.8219 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[True-False-True-False-True] 67.7070μs 18.4601μs 54.1710 KOps/s 53.8832 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[True-False-True-False-False] 59.5210μs 11.6718μs 85.6767 KOps/s 85.2469 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-False-False-True-True] 94.4880μs 35.2870μs 28.3391 KOps/s 28.2692 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-False-False-True-False] 0.4225ms 23.0882μs 43.3123 KOps/s 43.7530 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[True-False-False-False-True] 54.9030μs 20.1332μs 49.6692 KOps/s 49.5307 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[True-False-False-False-False] 44.2830μs 13.3177μs 75.0882 KOps/s 74.9525 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[False-True-True-True-True] 92.9940μs 33.1729μs 30.1451 KOps/s 30.1876 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[False-True-True-True-False] 56.3360μs 21.2502μs 47.0583 KOps/s 46.9400 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-True-True-False-True] 59.5520μs 20.7710μs 48.1440 KOps/s 47.8014 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[False-True-True-False-False] 60.0630μs 12.8633μs 77.7407 KOps/s 77.4152 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[False-True-False-True-True] 81.7930μs 35.0346μs 28.5432 KOps/s 28.7143 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-True-False-True-False] 63.5700μs 23.0065μs 43.4660 KOps/s 43.5071 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-False-True] 2.8523ms 22.6940μs 44.0645 KOps/s 44.1101 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[False-True-False-False-False] 55.6240μs 14.4568μs 69.1715 KOps/s 68.2349 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-True-True] 99.6470μs 37.1296μs 26.9327 KOps/s 25.2232 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_step_mdp_speed[False-False-True-True-False] 52.3980μs 24.6111μs 40.6321 KOps/s 37.3877 KOps/s $\textbf{\color{#35bf28}+8.68\%}$
test_step_mdp_speed[False-False-True-False-True] 61.1550μs 22.4971μs 44.4503 KOps/s 44.3871 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[False-False-True-False-False] 45.9860μs 14.7210μs 67.9304 KOps/s 68.1483 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[False-False-False-True-True] 75.3810μs 38.0732μs 26.2652 KOps/s 25.7387 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[False-False-False-True-False] 81.2220μs 26.3285μs 37.9816 KOps/s 38.5609 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-False-False-False-True] 63.4990μs 23.9759μs 41.7085 KOps/s 42.1243 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[False-False-False-False-False] 47.6290μs 16.1451μs 61.9381 KOps/s 61.5421 KOps/s $\color{#35bf28}+0.64\%$
test_values[generalized_advantage_estimate-True-True] 9.8610ms 9.5275ms 104.9596 Ops/s 104.7957 Ops/s $\color{#35bf28}+0.16\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.7624ms 33.6168ms 29.7470 Ops/s 29.8027 Ops/s $\color{#d91a1a}-0.19\%$
test_values[td0_return_estimate-False-False] 0.2386ms 0.1823ms 5.4854 KOps/s 5.5728 KOps/s $\color{#d91a1a}-1.57\%$
test_values[td1_return_estimate-False-False] 44.0780ms 24.6041ms 40.6436 Ops/s 41.6553 Ops/s $\color{#d91a1a}-2.43\%$
test_values[vec_td1_return_estimate-False-False] 35.7133ms 33.7770ms 29.6059 Ops/s 29.8642 Ops/s $\color{#d91a1a}-0.86\%$
test_values[td_lambda_return_estimate-True-False] 37.8151ms 34.0789ms 29.3436 Ops/s 28.5990 Ops/s $\color{#35bf28}+2.60\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.7810ms 33.7046ms 29.6695 Ops/s 29.3668 Ops/s $\color{#35bf28}+1.03\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.0838ms 8.2697ms 120.9227 Ops/s 119.0951 Ops/s $\color{#35bf28}+1.53\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5570ms 1.9918ms 502.0639 Ops/s 506.9033 Ops/s $\color{#d91a1a}-0.95\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5063ms 0.3594ms 2.7822 KOps/s 2.7533 KOps/s $\color{#35bf28}+1.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 44.9395ms 44.2538ms 22.5970 Ops/s 24.1877 Ops/s $\textbf{\color{#d91a1a}-6.58\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.8366ms 3.1103ms 321.5130 Ops/s 329.9688 Ops/s $\color{#d91a1a}-2.56\%$
test_dqn_speed[False-None] 1.9189ms 1.3973ms 715.6491 Ops/s 727.8024 Ops/s $\color{#d91a1a}-1.67\%$
test_dqn_speed[False-backward] 1.9884ms 1.9064ms 524.5604 Ops/s 534.3864 Ops/s $\color{#d91a1a}-1.84\%$
test_dqn_speed[True-None] 0.5732ms 0.4712ms 2.1222 KOps/s 2.1034 KOps/s $\color{#35bf28}+0.90\%$
test_dqn_speed[True-backward] 0.9671ms 0.8811ms 1.1350 KOps/s 1.1212 KOps/s $\color{#35bf28}+1.23\%$
test_dqn_speed[reduce-overhead-None] 0.7519ms 0.4703ms 2.1262 KOps/s 2.0470 KOps/s $\color{#35bf28}+3.87\%$
test_dqn_speed[reduce-overhead-backward] 0.9791ms 0.8943ms 1.1182 KOps/s 1.1278 KOps/s $\color{#d91a1a}-0.86\%$
test_ddpg_speed[False-None] 3.6933ms 2.9230ms 342.1169 Ops/s 347.6155 Ops/s $\color{#d91a1a}-1.58\%$
test_ddpg_speed[False-backward] 4.5182ms 4.0586ms 246.3925 Ops/s 248.6717 Ops/s $\color{#d91a1a}-0.92\%$
test_ddpg_speed[True-None] 1.4991ms 1.0154ms 984.7874 Ops/s 992.4647 Ops/s $\color{#d91a1a}-0.77\%$
test_ddpg_speed[True-backward] 2.0311ms 1.9347ms 516.8807 Ops/s 530.1127 Ops/s $\color{#d91a1a}-2.50\%$
test_ddpg_speed[reduce-overhead-None] 1.3616ms 1.0127ms 987.4208 Ops/s 970.5911 Ops/s $\color{#35bf28}+1.73\%$
test_ddpg_speed[reduce-overhead-backward] 2.0523ms 1.9554ms 511.4052 Ops/s 531.8412 Ops/s $\color{#d91a1a}-3.84\%$
test_sac_speed[False-None] 10.0284ms 8.4002ms 119.0447 Ops/s 124.4214 Ops/s $\color{#d91a1a}-4.32\%$
test_sac_speed[False-backward] 12.0285ms 11.3659ms 87.9822 Ops/s 90.7374 Ops/s $\color{#d91a1a}-3.04\%$
test_sac_speed[True-None] 2.3906ms 1.8519ms 539.9865 Ops/s 549.1089 Ops/s $\color{#d91a1a}-1.66\%$
test_sac_speed[True-backward] 4.4091ms 3.6052ms 277.3791 Ops/s 284.2523 Ops/s $\color{#d91a1a}-2.42\%$
test_sac_speed[reduce-overhead-None] 2.1070ms 1.8454ms 541.9010 Ops/s 545.4848 Ops/s $\color{#d91a1a}-0.66\%$
test_sac_speed[reduce-overhead-backward] 3.5940ms 3.5184ms 284.2172 Ops/s 284.5844 Ops/s $\color{#d91a1a}-0.13\%$
test_redq_speed[False-None] 14.0350ms 12.9393ms 77.2841 Ops/s 77.8497 Ops/s $\color{#d91a1a}-0.73\%$
test_redq_speed[False-backward] 24.1224ms 22.3875ms 44.6679 Ops/s 45.2971 Ops/s $\color{#d91a1a}-1.39\%$
test_redq_speed[True-None] 5.2955ms 4.4806ms 223.1833 Ops/s 219.4606 Ops/s $\color{#35bf28}+1.70\%$
test_redq_speed[True-backward] 17.2061ms 12.5455ms 79.7097 Ops/s 83.4107 Ops/s $\color{#d91a1a}-4.44\%$
test_redq_speed[reduce-overhead-None] 5.9837ms 4.7784ms 209.2738 Ops/s 221.8438 Ops/s $\textbf{\color{#d91a1a}-5.67\%}$
test_redq_speed[reduce-overhead-backward] 13.2255ms 12.7629ms 78.3521 Ops/s 83.6634 Ops/s $\textbf{\color{#d91a1a}-6.35\%}$
test_redq_deprec_speed[False-None] 15.2162ms 13.4001ms 74.6261 Ops/s 76.9068 Ops/s $\color{#d91a1a}-2.97\%$
test_redq_deprec_speed[False-backward] 20.7514ms 19.5000ms 51.2820 Ops/s 52.9112 Ops/s $\color{#d91a1a}-3.08\%$
test_redq_deprec_speed[True-None] 4.2243ms 3.5743ms 279.7733 Ops/s 279.8913 Ops/s $\color{#d91a1a}-0.04\%$
test_redq_deprec_speed[True-backward] 10.0687ms 8.3136ms 120.2845 Ops/s 122.3629 Ops/s $\color{#d91a1a}-1.70\%$
test_redq_deprec_speed[reduce-overhead-None] 4.3172ms 3.5727ms 279.9015 Ops/s 268.3162 Ops/s $\color{#35bf28}+4.32\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.9879ms 8.4588ms 118.2208 Ops/s 123.5579 Ops/s $\color{#d91a1a}-4.32\%$
test_td3_speed[False-None] 8.4958ms 8.0446ms 124.3075 Ops/s 122.2521 Ops/s $\color{#35bf28}+1.68\%$
test_td3_speed[False-backward] 10.6466ms 10.3433ms 96.6810 Ops/s 94.8957 Ops/s $\color{#35bf28}+1.88\%$
test_td3_speed[True-None] 2.2217ms 1.7269ms 579.0830 Ops/s 569.2703 Ops/s $\color{#35bf28}+1.72\%$
test_td3_speed[True-backward] 3.6906ms 3.4248ms 291.9873 Ops/s 293.6648 Ops/s $\color{#d91a1a}-0.57\%$
test_td3_speed[reduce-overhead-None] 2.0868ms 1.7164ms 582.6033 Ops/s 577.1264 Ops/s $\color{#35bf28}+0.95\%$
test_td3_speed[reduce-overhead-backward] 3.4573ms 3.3605ms 297.5777 Ops/s 292.7443 Ops/s $\color{#35bf28}+1.65\%$
test_cql_speed[False-None] 39.4142ms 36.1187ms 27.6865 Ops/s 26.0618 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_cql_speed[False-backward] 59.1963ms 46.3245ms 21.5869 Ops/s 21.3912 Ops/s $\color{#35bf28}+0.91\%$
test_cql_speed[True-None] 16.8621ms 15.9259ms 62.7908 Ops/s 63.7457 Ops/s $\color{#d91a1a}-1.50\%$
test_cql_speed[True-backward] 23.3205ms 22.7280ms 43.9985 Ops/s 44.8777 Ops/s $\color{#d91a1a}-1.96\%$
test_cql_speed[reduce-overhead-None] 17.4485ms 16.0966ms 62.1249 Ops/s 64.5333 Ops/s $\color{#d91a1a}-3.73\%$
test_cql_speed[reduce-overhead-backward] 23.5924ms 22.8408ms 43.7814 Ops/s 43.3923 Ops/s $\color{#35bf28}+0.90\%$
test_a2c_speed[False-None] 9.5451ms 7.4602ms 134.0448 Ops/s 134.4464 Ops/s $\color{#d91a1a}-0.30\%$
test_a2c_speed[False-backward] 15.3643ms 15.0171ms 66.5906 Ops/s 65.0686 Ops/s $\color{#35bf28}+2.34\%$
test_a2c_speed[True-None] 5.7750ms 4.3437ms 230.2194 Ops/s 231.8252 Ops/s $\color{#d91a1a}-0.69\%$
test_a2c_speed[True-backward] 11.7286ms 11.0716ms 90.3215 Ops/s 82.7911 Ops/s $\textbf{\color{#35bf28}+9.10\%}$
test_a2c_speed[reduce-overhead-None] 4.6768ms 4.2408ms 235.8041 Ops/s 224.7138 Ops/s $\color{#35bf28}+4.94\%$
test_a2c_speed[reduce-overhead-backward] 13.7218ms 11.2212ms 89.1170 Ops/s 90.1843 Ops/s $\color{#d91a1a}-1.18\%$
test_ppo_speed[False-None] 8.4152ms 7.6218ms 131.2022 Ops/s 130.2293 Ops/s $\color{#35bf28}+0.75\%$
test_ppo_speed[False-backward] 16.9841ms 15.3864ms 64.9926 Ops/s 66.1853 Ops/s $\color{#d91a1a}-1.80\%$
test_ppo_speed[True-None] 4.4423ms 3.7717ms 265.1331 Ops/s 264.7966 Ops/s $\color{#35bf28}+0.13\%$
test_ppo_speed[True-backward] 10.9314ms 10.2727ms 97.3451 Ops/s 101.1079 Ops/s $\color{#d91a1a}-3.72\%$
test_ppo_speed[reduce-overhead-None] 4.5091ms 3.9374ms 253.9754 Ops/s 254.2176 Ops/s $\color{#d91a1a}-0.10\%$
test_ppo_speed[reduce-overhead-backward] 10.7202ms 10.1994ms 98.0445 Ops/s 100.7529 Ops/s $\color{#d91a1a}-2.69\%$
test_reinforce_speed[False-None] 7.6987ms 6.8068ms 146.9115 Ops/s 148.4803 Ops/s $\color{#d91a1a}-1.06\%$
test_reinforce_speed[False-backward] 11.3903ms 10.2441ms 97.6171 Ops/s 97.7030 Ops/s $\color{#d91a1a}-0.09\%$
test_reinforce_speed[True-None] 3.2550ms 2.7480ms 363.9000 Ops/s 375.2694 Ops/s $\color{#d91a1a}-3.03\%$
test_reinforce_speed[True-backward] 10.3590ms 9.1300ms 109.5285 Ops/s 117.2558 Ops/s $\textbf{\color{#d91a1a}-6.59\%}$
test_reinforce_speed[reduce-overhead-None] 3.1929ms 2.7235ms 367.1757 Ops/s 375.1323 Ops/s $\color{#d91a1a}-2.12\%$
test_reinforce_speed[reduce-overhead-backward] 9.4028ms 8.9953ms 111.1686 Ops/s 116.3333 Ops/s $\color{#d91a1a}-4.44\%$
test_iql_speed[False-None] 34.0611ms 33.0104ms 30.2935 Ops/s 30.3587 Ops/s $\color{#d91a1a}-0.21\%$
test_iql_speed[False-backward] 47.2133ms 46.2104ms 21.6402 Ops/s 21.3623 Ops/s $\color{#35bf28}+1.30\%$
test_iql_speed[True-None] 23.3223ms 11.3526ms 88.0853 Ops/s 89.5483 Ops/s $\color{#d91a1a}-1.63\%$
test_iql_speed[True-backward] 23.4063ms 22.3898ms 44.6632 Ops/s 44.6149 Ops/s $\color{#35bf28}+0.11\%$
test_iql_speed[reduce-overhead-None] 11.5916ms 10.9654ms 91.1959 Ops/s 90.4790 Ops/s $\color{#35bf28}+0.79\%$
test_iql_speed[reduce-overhead-backward] 26.7183ms 22.3704ms 44.7018 Ops/s 46.1281 Ops/s $\color{#d91a1a}-3.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8753ms 5.2311ms 191.1656 Ops/s 197.4262 Ops/s $\color{#d91a1a}-3.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8255ms 0.5198ms 1.9238 KOps/s 1.9066 KOps/s $\color{#35bf28}+0.91\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7484ms 0.4960ms 2.0162 KOps/s 2.0394 KOps/s $\color{#d91a1a}-1.14\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3457ms 4.9476ms 202.1191 Ops/s 209.4650 Ops/s $\color{#d91a1a}-3.51\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.3857s 0.8122ms 1.2312 KOps/s 1.9899 KOps/s $\textbf{\color{#d91a1a}-38.13\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8517ms 0.4941ms 2.0239 KOps/s 2.0625 KOps/s $\color{#d91a1a}-1.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3893ms 1.6593ms 602.6804 Ops/s 603.6118 Ops/s $\color{#d91a1a}-0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3880ms 1.5869ms 630.1685 Ops/s 628.4071 Ops/s $\color{#35bf28}+0.28\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5979ms 5.1263ms 195.0712 Ops/s 200.1604 Ops/s $\color{#d91a1a}-2.54\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0429ms 0.6614ms 1.5119 KOps/s 1.5322 KOps/s $\color{#d91a1a}-1.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9735ms 0.6393ms 1.5643 KOps/s 1.5920 KOps/s $\color{#d91a1a}-1.74\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1003ms 5.0155ms 199.3809 Ops/s 208.2393 Ops/s $\color{#d91a1a}-4.25\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.7447ms 0.5351ms 1.8688 KOps/s 1.9259 KOps/s $\color{#d91a1a}-2.97\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7567ms 0.4988ms 2.0050 KOps/s 2.0072 KOps/s $\color{#d91a1a}-0.11\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2717ms 4.9419ms 202.3510 Ops/s 213.1859 Ops/s $\textbf{\color{#d91a1a}-5.08\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6356ms 0.5145ms 1.9437 KOps/s 2.0086 KOps/s $\color{#d91a1a}-3.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8488ms 0.4989ms 2.0044 KOps/s 2.1120 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.9821ms 5.1641ms 193.6434 Ops/s 203.6576 Ops/s $\color{#d91a1a}-4.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.1353ms 0.6727ms 1.4866 KOps/s 1.5244 KOps/s $\color{#d91a1a}-2.48\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9633ms 0.6441ms 1.5524 KOps/s 1.6197 KOps/s $\color{#d91a1a}-4.16\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4689s 13.5291ms 73.9145 Ops/s 39.1484 Ops/s $\textbf{\color{#35bf28}+88.81\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.1611ms 2.4729ms 404.3898 Ops/s 435.9315 Ops/s $\textbf{\color{#d91a1a}-7.24\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.0213ms 1.3588ms 735.9241 Ops/s 822.1737 Ops/s $\textbf{\color{#d91a1a}-10.49\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.6948ms 4.2425ms 235.7102 Ops/s 233.8998 Ops/s $\color{#35bf28}+0.77\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.6200ms 2.3771ms 420.6834 Ops/s 440.2276 Ops/s $\color{#d91a1a}-4.44\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.4248ms 1.3464ms 742.6989 Ops/s 757.2856 Ops/s $\color{#d91a1a}-1.93\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4157s 12.6879ms 78.8153 Ops/s 237.2673 Ops/s $\textbf{\color{#d91a1a}-66.78\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 12.1308ms 2.5269ms 395.7444 Ops/s 392.3949 Ops/s $\color{#35bf28}+0.85\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2386ms 1.4288ms 699.8929 Ops/s 671.2631 Ops/s $\color{#35bf28}+4.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.6063ms 11.2712ms 88.7215 Ops/s 88.8400 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.8109ms 15.2486ms 65.5799 Ops/s 65.0561 Ops/s $\color{#35bf28}+0.81\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.2096ms 20.1583ms 49.6073 Ops/s 50.1792 Ops/s $\color{#d91a1a}-1.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.7903ms 15.3369ms 65.2024 Ops/s 65.2902 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.5534ms 19.9898ms 50.0254 Ops/s 50.4099 Ops/s $\color{#d91a1a}-0.76\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 30.8291ms 17.1663ms 58.2536 Ops/s 59.1398 Ops/s $\color{#d91a1a}-1.50\%$
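
The report does not state how the Change column is derived. The rows are consistent with the relative difference between the PR's throughput and the throughput on repo HEAD, and the bolded entries (5 positive, 10 negative in the CPU table) match the Improved and Worsened counts, which suggests a threshold of about 5%. The sketch below checks that reading against three rows copied from the table; the function names and the exact 5% threshold are assumptions inferred from the numbers, not taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Throughput change of the PR relative to repo HEAD, in percent."""
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD, reported Change in %), both throughputs in Ops/s.
rows = {
    "test_simple": (2.2880, 2.2627, 1.12),
    "test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]": (
        1231.2, 1989.9, -38.13),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (
        73.9145, 39.1484, 88.81),
}

for name, (ops, ops_head, reported) in rows.items():
    change = relative_change(ops, ops_head)
    # Table values are rounded, so allow a small tolerance.
    assert abs(change - reported) < 0.01, name
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, most of the small positive and negative entries fall inside the threshold and do not count toward either total.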

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7311s 0.7254s 1.3785 Ops/s 1.3856 Ops/s $\color{#d91a1a}-0.51\%$
test_transformed 0.9689s 0.9687s 1.0323 Ops/s 1.0330 Ops/s $\color{#d91a1a}-0.07\%$
test_serial 2.0869s 2.0823s 0.4802 Ops/s 0.4799 Ops/s $\color{#35bf28}+0.08\%$
test_parallel 1.9755s 1.9071s 0.5244 Ops/s 0.5161 Ops/s $\color{#35bf28}+1.61\%$
test_step_mdp_speed[True-True-True-True-True] 0.2356ms 38.8830μs 25.7182 KOps/s 25.1966 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[True-True-True-True-False] 0.1446ms 22.0970μs 45.2550 KOps/s 44.0015 KOps/s $\color{#35bf28}+2.85\%$
test_step_mdp_speed[True-True-True-False-True] 73.7710μs 21.5735μs 46.3533 KOps/s 47.4633 KOps/s $\color{#d91a1a}-2.34\%$
test_step_mdp_speed[True-True-True-False-False] 39.0910μs 12.3379μs 81.0512 KOps/s 80.0369 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[True-True-False-True-True] 0.1309ms 40.6975μs 24.5716 KOps/s 24.2810 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[True-True-False-True-False] 59.6110μs 23.7090μs 42.1782 KOps/s 41.5465 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[True-True-False-False-True] 52.1010μs 23.3688μs 42.7921 KOps/s 42.8931 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[True-True-False-False-False] 80.8710μs 14.3595μs 69.6403 KOps/s 68.9767 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-False-True-True-True] 73.0010μs 42.9855μs 23.2637 KOps/s 23.3387 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-False-True-True-False] 55.6410μs 26.2052μs 38.1604 KOps/s 38.3347 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[True-False-True-False-True] 0.1131ms 23.6234μs 42.3310 KOps/s 44.0345 KOps/s $\color{#d91a1a}-3.87\%$
test_step_mdp_speed[True-False-True-False-False] 0.1062ms 14.3042μs 69.9094 KOps/s 69.5332 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-False-False-True-True] 77.8720μs 44.4837μs 22.4801 KOps/s 22.1100 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[True-False-False-True-False] 64.2410μs 27.8065μs 35.9629 KOps/s 35.2856 KOps/s $\color{#35bf28}+1.92\%$
test_step_mdp_speed[True-False-False-False-True] 59.1910μs 25.7761μs 38.7956 KOps/s 39.2397 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[True-False-False-False-False] 48.6510μs 16.2012μs 61.7236 KOps/s 61.1637 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[False-True-True-True-True] 77.0120μs 42.3597μs 23.6074 KOps/s 23.8114 KOps/s $\color{#d91a1a}-0.86\%$
test_step_mdp_speed[False-True-True-True-False] 52.1810μs 25.9704μs 38.5054 KOps/s 38.8938 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[False-True-True-False-True] 61.1910μs 26.8783μs 37.2047 KOps/s 36.9524 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-True-True-False-False] 66.0010μs 15.5316μs 64.3849 KOps/s 61.5778 KOps/s $\color{#35bf28}+4.56\%$
test_step_mdp_speed[False-True-False-True-True] 0.1378ms 44.7967μs 22.3231 KOps/s 22.3047 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[False-True-False-True-False] 71.4520μs 28.2099μs 35.4485 KOps/s 35.4897 KOps/s $\color{#d91a1a}-0.12\%$
test_step_mdp_speed[False-True-False-False-True] 3.1855ms 29.1586μs 34.2952 KOps/s 34.8906 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[False-True-False-False-False] 0.1671ms 18.0970μs 55.2579 KOps/s 55.3090 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-False-True-True-True] 0.1218ms 47.5238μs 21.0421 KOps/s 21.4494 KOps/s $\color{#d91a1a}-1.90\%$
test_step_mdp_speed[False-False-True-True-False] 62.4210μs 30.4673μs 32.8221 KOps/s 33.0337 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-False-True-False-True] 0.2291ms 28.9986μs 34.4845 KOps/s 34.7324 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[False-False-True-False-False] 58.3310μs 17.8718μs 55.9541 KOps/s 55.1523 KOps/s $\color{#35bf28}+1.45\%$
test_step_mdp_speed[False-False-False-True-True] 80.4320μs 48.2471μs 20.7266 KOps/s 20.7012 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[False-False-False-True-False] 60.0310μs 32.4009μs 30.8633 KOps/s 31.1200 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[False-False-False-False-True] 81.0010μs 29.7653μs 33.5961 KOps/s 33.4286 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[False-False-False-False-False] 51.2610μs 19.8210μs 50.4517 KOps/s 49.7432 KOps/s $\color{#35bf28}+1.42\%$
test_values[generalized_advantage_estimate-True-True] 25.3205ms 24.1497ms 41.4084 Ops/s 42.4735 Ops/s $\color{#d91a1a}-2.51\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1126s 3.1324ms 319.2487 Ops/s 325.9051 Ops/s $\color{#d91a1a}-2.04\%$
test_values[td0_return_estimate-False-False] 0.1007ms 78.0304μs 12.8155 KOps/s 12.7266 KOps/s $\color{#35bf28}+0.70\%$
test_values[td1_return_estimate-False-False] 56.9670ms 53.1039ms 18.8310 Ops/s 19.1168 Ops/s $\color{#d91a1a}-1.50\%$
test_values[vec_td1_return_estimate-False-False] 1.2849ms 1.0763ms 929.0859 Ops/s 949.9034 Ops/s $\color{#d91a1a}-2.19\%$
test_values[td_lambda_return_estimate-True-False] 84.5112ms 83.7582ms 11.9391 Ops/s 12.0920 Ops/s $\color{#d91a1a}-1.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.1933ms 1.0530ms 949.6877 Ops/s 946.6592 Ops/s $\color{#35bf28}+0.32\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.9559ms 23.4625ms 42.6212 Ops/s 42.7686 Ops/s $\color{#d91a1a}-0.34\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0435ms 0.7260ms 1.3774 KOps/s 1.3536 KOps/s $\color{#35bf28}+1.76\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7980ms 0.6399ms 1.5627 KOps/s 1.5445 KOps/s $\color{#35bf28}+1.17\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6052ms 1.4573ms 686.1913 Ops/s 687.4215 Ops/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8202ms 0.6790ms 1.4727 KOps/s 1.5213 KOps/s $\color{#d91a1a}-3.20\%$
test_dqn_speed[False-None] 7.0660ms 1.4749ms 678.0306 Ops/s 691.1571 Ops/s $\color{#d91a1a}-1.90\%$
test_dqn_speed[False-backward] 2.1972ms 2.0422ms 489.6634 Ops/s 493.0044 Ops/s $\color{#d91a1a}-0.68\%$
test_dqn_speed[True-None] 0.6921ms 0.5102ms 1.9599 KOps/s 1.9270 KOps/s $\color{#35bf28}+1.71\%$
test_dqn_speed[True-backward] 1.0967ms 1.0497ms 952.6834 Ops/s 846.2233 Ops/s $\textbf{\color{#35bf28}+12.58\%}$
test_dqn_speed[reduce-overhead-None] 0.6973ms 0.5268ms 1.8983 KOps/s 1.8580 KOps/s $\color{#35bf28}+2.17\%$
test_dqn_speed[reduce-overhead-backward] 1.1140ms 0.9254ms 1.0806 KOps/s 956.3881 Ops/s $\textbf{\color{#35bf28}+12.99\%}$
test_ddpg_speed[False-None] 3.0510ms 2.7365ms 365.4340 Ops/s 362.5306 Ops/s $\color{#35bf28}+0.80\%$
test_ddpg_speed[False-backward] 4.3289ms 3.8958ms 256.6877 Ops/s 244.8136 Ops/s $\color{#35bf28}+4.85\%$
test_ddpg_speed[True-None] 1.1862ms 1.0254ms 975.1962 Ops/s 960.5485 Ops/s $\color{#35bf28}+1.52\%$
test_ddpg_speed[True-backward] 2.1863ms 2.0526ms 487.1934 Ops/s 438.3030 Ops/s $\textbf{\color{#35bf28}+11.15\%}$
test_ddpg_speed[reduce-overhead-None] 1.2064ms 1.0362ms 965.0469 Ops/s 906.3007 Ops/s $\textbf{\color{#35bf28}+6.48\%}$
test_ddpg_speed[reduce-overhead-backward] 1.7277ms 1.5615ms 640.4295 Ops/s 573.4690 Ops/s $\textbf{\color{#35bf28}+11.68\%}$
test_sac_speed[False-None] 8.1567ms 7.7150ms 129.6178 Ops/s 126.7847 Ops/s $\color{#35bf28}+2.23\%$
test_sac_speed[False-backward] 11.0747ms 10.4552ms 95.6461 Ops/s 92.1945 Ops/s $\color{#35bf28}+3.74\%$
test_sac_speed[True-None] 1.6389ms 1.4695ms 680.5113 Ops/s 665.5730 Ops/s $\color{#35bf28}+2.24\%$
test_sac_speed[True-backward] 3.2974ms 3.0934ms 323.2657 Ops/s 307.5542 Ops/s $\textbf{\color{#35bf28}+5.11\%}$
test_sac_speed[reduce-overhead-None] 22.5407ms 12.5171ms 79.8908 Ops/s 79.2258 Ops/s $\color{#35bf28}+0.84\%$
test_sac_speed[reduce-overhead-backward] 1.4385ms 1.2930ms 773.3889 Ops/s 755.8689 Ops/s $\color{#35bf28}+2.32\%$
test_redq_speed[False-None] 7.9829ms 7.1897ms 139.0870 Ops/s 135.6588 Ops/s $\color{#35bf28}+2.53\%$
test_redq_speed[False-backward] 11.3896ms 10.7332ms 93.1685 Ops/s 91.1394 Ops/s $\color{#35bf28}+2.23\%$
test_redq_speed[True-None] 2.0890ms 1.9067ms 524.4791 Ops/s 524.5981 Ops/s $\color{#d91a1a}-0.02\%$
test_redq_speed[True-backward] 3.6150ms 3.4694ms 288.2355 Ops/s 269.6568 Ops/s $\textbf{\color{#35bf28}+6.89\%}$
test_redq_speed[reduce-overhead-None] 2.0932ms 1.9030ms 525.4795 Ops/s 526.4771 Ops/s $\color{#d91a1a}-0.19\%$
test_redq_speed[reduce-overhead-backward] 3.6138ms 3.4696ms 288.2170 Ops/s 269.4810 Ops/s $\textbf{\color{#35bf28}+6.95\%}$
test_redq_deprec_speed[False-None] 9.2531ms 8.6959ms 114.9972 Ops/s 115.7057 Ops/s $\color{#d91a1a}-0.61\%$
test_redq_deprec_speed[False-backward] 11.9891ms 11.4747ms 87.1484 Ops/s 84.9052 Ops/s $\color{#35bf28}+2.64\%$
test_redq_deprec_speed[True-None] 2.6281ms 2.2374ms 446.9516 Ops/s 420.6536 Ops/s $\textbf{\color{#35bf28}+6.25\%}$
test_redq_deprec_speed[True-backward] 4.0997ms 3.8068ms 262.6845 Ops/s 255.3025 Ops/s $\color{#35bf28}+2.89\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4089ms 2.2347ms 447.4836 Ops/s 433.0785 Ops/s $\color{#35bf28}+3.33\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6530ms 3.8065ms 262.7078 Ops/s 256.8177 Ops/s $\color{#35bf28}+2.29\%$
test_td3_speed[False-None] 7.7906ms 7.7063ms 129.7639 Ops/s 121.9594 Ops/s $\textbf{\color{#35bf28}+6.40\%}$
test_td3_speed[False-backward] 10.4497ms 9.9230ms 100.7765 Ops/s 100.5335 Ops/s $\color{#35bf28}+0.24\%$
test_td3_speed[True-None] 1.5665ms 1.5084ms 662.9455 Ops/s 660.6663 Ops/s $\color{#35bf28}+0.34\%$
test_td3_speed[True-backward] 3.1329ms 2.9481ms 339.2031 Ops/s 330.4573 Ops/s $\color{#35bf28}+2.65\%$
test_td3_speed[reduce-overhead-None] 77.4193ms 24.6917ms 40.4995 Ops/s 37.9736 Ops/s $\textbf{\color{#35bf28}+6.65\%}$
test_td3_speed[reduce-overhead-backward] 1.6623ms 1.2658ms 790.0039 Ops/s 796.0109 Ops/s $\color{#d91a1a}-0.75\%$
test_cql_speed[False-None] 16.2990ms 15.7909ms 63.3274 Ops/s 63.2031 Ops/s $\color{#35bf28}+0.20\%$
test_cql_speed[False-backward] 20.8112ms 20.2767ms 49.3177 Ops/s 48.3245 Ops/s $\color{#35bf28}+2.06\%$
test_cql_speed[True-None] 2.9829ms 2.7841ms 359.1885 Ops/s 351.0145 Ops/s $\color{#35bf28}+2.33\%$
test_cql_speed[True-backward] 5.1819ms 4.8249ms 207.2600 Ops/s 202.8413 Ops/s $\color{#35bf28}+2.18\%$
test_cql_speed[reduce-overhead-None] 20.5456ms 12.7052ms 78.7079 Ops/s 59.5202 Ops/s $\textbf{\color{#35bf28}+32.24\%}$
test_cql_speed[reduce-overhead-backward] 1.6312ms 1.4399ms 694.4730 Ops/s 678.5918 Ops/s $\color{#35bf28}+2.34\%$
test_a2c_speed[False-None] 3.5312ms 3.0946ms 323.1470 Ops/s 321.1628 Ops/s $\color{#35bf28}+0.62\%$
test_a2c_speed[False-backward] 6.6256ms 5.8542ms 170.8186 Ops/s 169.9743 Ops/s $\color{#35bf28}+0.50\%$
test_a2c_speed[True-None] 1.3784ms 0.9858ms 1.0144 KOps/s 1.0306 KOps/s $\color{#d91a1a}-1.57\%$
test_a2c_speed[True-backward] 2.5853ms 2.4737ms 404.2536 Ops/s 387.9436 Ops/s $\color{#35bf28}+4.20\%$
test_a2c_speed[reduce-overhead-None] 21.3352ms 11.5545ms 86.5464 Ops/s 86.5035 Ops/s $\color{#35bf28}+0.05\%$
test_a2c_speed[reduce-overhead-backward] 1.0921ms 0.9598ms 1.0418 KOps/s 1.0429 KOps/s $\color{#d91a1a}-0.10\%$
test_ppo_speed[False-None] 3.7336ms 3.4870ms 286.7790 Ops/s 281.9423 Ops/s $\color{#35bf28}+1.72\%$
test_ppo_speed[False-backward] 6.8726ms 6.3930ms 156.4214 Ops/s 152.8089 Ops/s $\color{#35bf28}+2.36\%$
test_ppo_speed[True-None] 1.0709ms 0.8979ms 1.1137 KOps/s 1.0866 KOps/s $\color{#35bf28}+2.49\%$
test_ppo_speed[True-backward] 2.5323ms 2.4076ms 415.3487 Ops/s 405.6751 Ops/s $\color{#35bf28}+2.38\%$
test_ppo_speed[reduce-overhead-None] 0.6744ms 0.4814ms 2.0774 KOps/s 1.9479 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_ppo_speed[reduce-overhead-backward] 0.9551ms 0.9164ms 1.0912 KOps/s 1.0467 KOps/s $\color{#35bf28}+4.25\%$
test_reinforce_speed[False-None] 2.3313ms 2.1685ms 461.1407 Ops/s 455.8580 Ops/s $\color{#35bf28}+1.16\%$
test_reinforce_speed[False-backward] 3.5914ms 3.1444ms 318.0304 Ops/s 317.5897 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[True-None] 1.2358ms 0.8090ms 1.2360 KOps/s 1.2146 KOps/s $\color{#35bf28}+1.76\%$
test_reinforce_speed[True-backward] 2.7022ms 2.3395ms 427.4373 Ops/s 422.1349 Ops/s $\color{#35bf28}+1.26\%$
test_reinforce_speed[reduce-overhead-None] 21.3408ms 11.2887ms 88.5840 Ops/s 87.0299 Ops/s $\color{#35bf28}+1.79\%$
test_reinforce_speed[reduce-overhead-backward] 1.0715ms 1.0045ms 995.5004 Ops/s 983.5141 Ops/s $\color{#35bf28}+1.22\%$
test_iql_speed[False-None] 9.4702ms 9.0138ms 110.9412 Ops/s 107.4977 Ops/s $\color{#35bf28}+3.20\%$
test_iql_speed[False-backward] 13.1150ms 12.4489ms 80.3282 Ops/s 78.3807 Ops/s $\color{#35bf28}+2.48\%$
test_iql_speed[True-None] 2.2943ms 1.7120ms 584.1090 Ops/s 588.9970 Ops/s $\color{#d91a1a}-0.83\%$
test_iql_speed[True-backward] 4.3468ms 4.0337ms 247.9086 Ops/s 241.8879 Ops/s $\color{#35bf28}+2.49\%$
test_iql_speed[reduce-overhead-None] 19.2559ms 11.0974ms 90.1113 Ops/s 115.3669 Ops/s $\textbf{\color{#d91a1a}-21.89\%}$
test_iql_speed[reduce-overhead-backward] 1.5082ms 1.3776ms 725.9206 Ops/s 731.3100 Ops/s $\color{#d91a1a}-0.74\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.1201ms 6.2341ms 160.4077 Ops/s 158.9910 Ops/s $\color{#35bf28}+0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8248ms 0.3091ms 3.2353 KOps/s 2.8480 KOps/s $\textbf{\color{#35bf28}+13.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5684ms 0.3115ms 3.2100 KOps/s 3.3068 KOps/s $\color{#d91a1a}-2.93\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4822ms 6.0182ms 166.1631 Ops/s 166.4442 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0005ms 0.2974ms 3.3624 KOps/s 2.7914 KOps/s $\textbf{\color{#35bf28}+20.46\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6605ms 0.2804ms 3.5665 KOps/s 3.6449 KOps/s $\color{#d91a1a}-2.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6976ms 1.2663ms 789.7303 Ops/s 713.7404 Ops/s $\textbf{\color{#35bf28}+10.65\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5018ms 1.1997ms 833.5444 Ops/s 765.2958 Ops/s $\textbf{\color{#35bf28}+8.92\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5145ms 6.2176ms 160.8330 Ops/s 161.3467 Ops/s $\color{#d91a1a}-0.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8004ms 0.4969ms 2.0124 KOps/s 2.1803 KOps/s $\textbf{\color{#d91a1a}-7.70\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7324ms 0.4918ms 2.0333 KOps/s 2.4830 KOps/s $\textbf{\color{#d91a1a}-18.11\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4151ms 6.0504ms 165.2794 Ops/s 166.2817 Ops/s $\color{#d91a1a}-0.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7516ms 0.3528ms 2.8343 KOps/s 3.4798 KOps/s $\textbf{\color{#d91a1a}-18.55\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6500ms 0.3430ms 2.9153 KOps/s 3.0658 KOps/s $\color{#d91a1a}-4.91\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6141ms 6.0289ms 165.8676 Ops/s 167.5150 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.4695ms 0.3585ms 2.7893 KOps/s 3.0081 KOps/s $\textbf{\color{#d91a1a}-7.27\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7201ms 0.2479ms 4.0343 KOps/s 3.3290 KOps/s $\textbf{\color{#35bf28}+21.19\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6163ms 6.1766ms 161.9019 Ops/s 162.1938 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4791ms 0.5007ms 1.9970 KOps/s 2.1494 KOps/s $\textbf{\color{#d91a1a}-7.09\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7853ms 0.4858ms 2.0583 KOps/s 2.1893 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.2073ms 5.4499ms 183.4888 Ops/s 194.0942 Ops/s $\textbf{\color{#d91a1a}-5.46\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.5016ms 2.0834ms 479.9917 Ops/s 516.8257 Ops/s $\textbf{\color{#d91a1a}-7.13\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.0753ms 1.2271ms 814.9204 Ops/s 828.5663 Ops/s $\color{#d91a1a}-1.65\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1504ms 5.4234ms 184.3848 Ops/s 194.9145 Ops/s $\textbf{\color{#d91a1a}-5.40\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.3716ms 1.9965ms 500.8651 Ops/s 428.8660 Ops/s $\textbf{\color{#35bf28}+16.79\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.5270s 11.8704ms 84.2432 Ops/s 831.8136 Ops/s $\textbf{\color{#d91a1a}-89.87\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 13.8465ms 5.8249ms 171.6755 Ops/s 30.9586 Ops/s $\textbf{\color{#35bf28}+454.53\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.6953ms 2.1608ms 462.7855 Ops/s 431.2964 Ops/s $\textbf{\color{#35bf28}+7.30\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.6499ms 1.3986ms 715.0010 Ops/s 760.8430 Ops/s $\textbf{\color{#d91a1a}-6.03\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.7470ms 13.0373ms 76.7028 Ops/s 79.0154 Ops/s $\color{#d91a1a}-2.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.3636ms 16.9574ms 58.9714 Ops/s 57.3668 Ops/s $\color{#35bf28}+2.80\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.7072ms 17.1374ms 58.3518 Ops/s 55.2137 Ops/s $\textbf{\color{#35bf28}+5.68\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.3208ms 17.2564ms 57.9495 Ops/s 56.6820 Ops/s $\color{#35bf28}+2.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.0011ms 17.4553ms 57.2892 Ops/s 55.9528 Ops/s $\color{#35bf28}+2.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.4942ms 18.9369ms 52.8068 Ops/s 52.0993 Ops/s $\color{#35bf28}+1.36\%$
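
For reading the table: the last column appears to be the relative difference between the two Ops/s columns, with the first Ops/s value compared against the second. The column headers are not visible in this part of the page, so which value belongs to this PR and which to the base branch is an assumption here. A minimal sketch (the helper name `relative_change` is hypothetical, not part of the benchmark tooling) that reproduces the reported percentages for three rows copied from the table:

```python
def relative_change(first_ops: float, second_ops: float) -> float:
    """Percent difference of the first Ops/s value relative to the second.

    Assumption: the first Ops/s column is the new run and the second is the
    baseline; the table header that would confirm this is not shown here.
    """
    return (first_ops / second_ops - 1.0) * 100.0


# (first Ops/s, second Ops/s) pairs copied verbatim from rows of the table.
rows = {
    "test_redq_speed[False-None]": (139.0870, 135.6588),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]": (84.2432, 831.8136),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (171.6755, 30.9586),
}

for name, (first_ops, second_ops) in rows.items():
    # Format with an explicit sign and two decimals, like the table column.
    print(f"{name}: {relative_change(first_ops, second_ops):+.2f}%")
```

Because the column is a ratio of throughputs, a single slow run (for example the 0.5270s maximum in the `LazyTensorStorage-SamplerWithoutReplacement-400` row) can swing it by a large amount, so the biggest entries are worth re-checking before drawing conclusions.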

Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
3 participants