[Feature] TD3 compatibility with compile #2656

Closed
vmoens wants to merge 19 commits

@vmoens commented on Dec 16, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2656

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 7 Unrelated Failures

As of commit 517aec3 with merge base 187de7c:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Dec 16, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 51222d355d9b3a0b900d987d216d2ad2f1fb0bd2
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 4868973ab36ce1dfa230f299f54e600d002f2900
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 6531c2f1c1052618ba71f2edb18c9c68876891fc
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 7a55d4aa03544921f022402ff0d81a10bfff38b0
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 6b2f00c347c34bf85b36bab05e66b8e9f4bf7d62
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: f36fc0962ba4c8ee6e649beaa450c96e1b58f897
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 7d637c6ce09b850a5161cb0066b3bf8e065b7406
Pull Request resolved: #2656
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: bf4ac88e13e30edf83f34cd838f3a82d323411ba
Pull Request resolved: #2656
[ghstack-poisoned]
@vmoens closed this on Dec 16, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 38. Worsened: 12.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4338s 0.4309s 2.3207 Ops/s 2.2164 Ops/s $\color{#35bf28}+4.70\%$
test_transformed 0.6114s 0.6054s 1.6517 Ops/s 1.5788 Ops/s $\color{#35bf28}+4.62\%$
test_serial 1.3501s 1.3435s 0.7443 Ops/s 0.7248 Ops/s $\color{#35bf28}+2.70\%$
test_parallel 1.2983s 1.2927s 0.7736 Ops/s 0.7546 Ops/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[True-True-True-True-True] 0.2496ms 29.2098μs 34.2351 KOps/s 33.2874 KOps/s $\color{#35bf28}+2.85\%$
test_step_mdp_speed[True-True-True-True-False] 48.4400μs 17.2766μs 57.8817 KOps/s 57.3935 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-True-True-False-True] 0.2848ms 17.2041μs 58.1258 KOps/s 59.1200 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[True-True-True-False-False] 37.4700μs 9.6600μs 103.5196 KOps/s 101.1685 KOps/s $\color{#35bf28}+2.32\%$
test_step_mdp_speed[True-True-False-True-True] 68.6180μs 31.6792μs 31.5664 KOps/s 31.4553 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[True-True-False-True-False] 55.1120μs 19.0421μs 52.5153 KOps/s 52.0192 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-True-False-False-True] 73.9210μs 18.3101μs 54.6145 KOps/s 53.7628 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[True-True-False-False-False] 41.8080μs 11.4184μs 87.5782 KOps/s 85.8570 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[True-False-True-True-True] 0.1013ms 33.2063μs 30.1148 KOps/s 29.8456 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[True-False-True-True-False] 0.2393ms 21.0205μs 47.5727 KOps/s 47.7792 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-True-False-True] 82.2740μs 18.2855μs 54.6883 KOps/s 53.8125 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-True-False-False] 40.5460μs 11.4908μs 87.0258 KOps/s 86.2019 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-False-False-True-True] 91.7720μs 34.7923μs 28.7420 KOps/s 28.5094 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[True-False-False-True-False] 84.2970μs 22.3601μs 44.7225 KOps/s 44.6596 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[True-False-False-False-True] 64.6210μs 19.9598μs 50.1008 KOps/s 49.4229 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-False-False-False-False] 98.0130μs 13.1556μs 76.0135 KOps/s 75.8767 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[False-True-True-True-True] 0.6420ms 33.1603μs 30.1566 KOps/s 30.0477 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-True-True-True-False] 78.3260μs 21.0390μs 47.5307 KOps/s 47.9753 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[False-True-True-False-True] 54.7630μs 20.9132μs 47.8167 KOps/s 47.1910 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[False-True-True-False-False] 61.8660μs 13.0077μs 76.8777 KOps/s 78.1346 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[False-True-False-True-True] 80.1700μs 35.0843μs 28.5027 KOps/s 28.1130 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-False-True-False] 82.4640μs 22.5315μs 44.3824 KOps/s 44.1466 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-True-False-False-True] 2.8729ms 23.1220μs 43.2488 KOps/s 43.6897 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[False-True-False-False-False] 47.5890μs 14.4654μs 69.1306 KOps/s 67.6304 KOps/s $\color{#35bf28}+2.22\%$
test_step_mdp_speed[False-False-True-True-True] 0.1039ms 36.5332μs 27.3723 KOps/s 27.3954 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-False-True-True-False] 65.7530μs 24.1986μs 41.3247 KOps/s 41.2369 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[False-False-True-False-True] 78.2760μs 22.6360μs 44.1774 KOps/s 43.5790 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-False-False] 56.3350μs 14.5621μs 68.6712 KOps/s 68.3851 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[False-False-False-True-True] 0.1315ms 38.1370μs 26.2212 KOps/s 26.3155 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[False-False-False-True-False] 88.4780μs 26.1281μs 38.2729 KOps/s 38.9914 KOps/s $\color{#d91a1a}-1.84\%$
test_step_mdp_speed[False-False-False-False-True] 66.2730μs 23.9618μs 41.7330 KOps/s 41.8634 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[False-False-False-False-False] 52.4280μs 16.0621μs 62.2584 KOps/s 61.4320 KOps/s $\color{#35bf28}+1.35\%$
test_values[generalized_advantage_estimate-True-True] 10.7192ms 9.8244ms 101.7874 Ops/s 104.7227 Ops/s $\color{#d91a1a}-2.80\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.5887ms 34.2825ms 29.1694 Ops/s 29.7874 Ops/s $\color{#d91a1a}-2.07\%$
test_values[td0_return_estimate-False-False] 0.2629ms 0.2182ms 4.5833 KOps/s 5.4258 KOps/s $\textbf{\color{#d91a1a}-15.53\%}$
test_values[td1_return_estimate-False-False] 25.3539ms 24.2102ms 41.3050 Ops/s 41.7643 Ops/s $\color{#d91a1a}-1.10\%$
test_values[vec_td1_return_estimate-False-False] 35.8834ms 33.8278ms 29.5615 Ops/s 29.7419 Ops/s $\color{#d91a1a}-0.61\%$
test_values[td_lambda_return_estimate-True-False] 37.6530ms 34.8689ms 28.6789 Ops/s 29.3697 Ops/s $\color{#d91a1a}-2.35\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.4266ms 33.7890ms 29.5955 Ops/s 29.2675 Ops/s $\color{#35bf28}+1.12\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.4246ms 8.2329ms 121.4637 Ops/s 119.2593 Ops/s $\color{#35bf28}+1.85\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4521ms 2.0588ms 485.7139 Ops/s 494.3588 Ops/s $\color{#d91a1a}-1.75\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1.6617ms 0.3645ms 2.7437 KOps/s 2.7318 KOps/s $\color{#35bf28}+0.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.7388ms 45.7555ms 21.8553 Ops/s 20.1337 Ops/s $\textbf{\color{#35bf28}+8.55\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.8400ms 3.0664ms 326.1129 Ops/s 326.9130 Ops/s $\color{#d91a1a}-0.24\%$
test_dqn_speed[False-None] 1.9558ms 1.3860ms 721.5183 Ops/s 722.2135 Ops/s $\color{#d91a1a}-0.10\%$
test_dqn_speed[False-backward] 1.9610ms 1.9031ms 525.4586 Ops/s 536.6180 Ops/s $\color{#d91a1a}-2.08\%$
test_dqn_speed[True-None] 0.8321ms 0.4693ms 2.1307 KOps/s 2.1417 KOps/s $\color{#d91a1a}-0.52\%$
test_dqn_speed[True-backward] 0.9693ms 0.9032ms 1.1072 KOps/s 1.1140 KOps/s $\color{#d91a1a}-0.61\%$
test_dqn_speed[reduce-overhead-None] 0.6205ms 0.4637ms 2.1567 KOps/s 2.1507 KOps/s $\color{#35bf28}+0.28\%$
test_dqn_speed[reduce-overhead-backward] 0.9579ms 0.8979ms 1.1137 KOps/s 1.1023 KOps/s $\color{#35bf28}+1.03\%$
test_ddpg_speed[False-None] 3.2192ms 2.8923ms 345.7515 Ops/s 347.3850 Ops/s $\color{#d91a1a}-0.47\%$
test_ddpg_speed[False-backward] 4.4049ms 4.1021ms 243.7764 Ops/s 248.6836 Ops/s $\color{#d91a1a}-1.97\%$
test_ddpg_speed[True-None] 1.4424ms 0.9970ms 1.0031 KOps/s 991.9016 Ops/s $\color{#35bf28}+1.12\%$
test_ddpg_speed[True-backward] 2.0157ms 1.9404ms 515.3612 Ops/s 517.6674 Ops/s $\color{#d91a1a}-0.45\%$
test_ddpg_speed[reduce-overhead-None] 1.1665ms 0.9970ms 1.0030 KOps/s 990.1039 Ops/s $\color{#35bf28}+1.31\%$
test_ddpg_speed[reduce-overhead-backward] 2.0206ms 1.9228ms 520.0781 Ops/s 470.2566 Ops/s $\textbf{\color{#35bf28}+10.59\%}$
test_sac_speed[False-None] 10.1531ms 8.1577ms 122.5841 Ops/s 121.1745 Ops/s $\color{#35bf28}+1.16\%$
test_sac_speed[False-backward] 11.7675ms 11.1265ms 89.8756 Ops/s 90.2761 Ops/s $\color{#d91a1a}-0.44\%$
test_sac_speed[True-None] 2.3688ms 1.8341ms 545.2342 Ops/s 538.0962 Ops/s $\color{#35bf28}+1.33\%$
test_sac_speed[True-backward] 3.7104ms 3.6339ms 275.1830 Ops/s 269.1291 Ops/s $\color{#35bf28}+2.25\%$
test_sac_speed[reduce-overhead-None] 2.2525ms 1.8320ms 545.8605 Ops/s 545.3770 Ops/s $\color{#35bf28}+0.09\%$
test_sac_speed[reduce-overhead-backward] 3.5815ms 3.5399ms 282.4964 Ops/s 280.8222 Ops/s $\color{#35bf28}+0.60\%$
test_redq_speed[False-None] 0.2456s 16.0211ms 62.4177 Ops/s 77.2888 Ops/s $\textbf{\color{#d91a1a}-19.24\%}$
test_redq_speed[False-backward] 23.1180ms 22.1515ms 45.1436 Ops/s 45.0590 Ops/s $\color{#35bf28}+0.19\%$
test_redq_speed[True-None] 6.1743ms 5.5033ms 181.7078 Ops/s 218.4161 Ops/s $\textbf{\color{#d91a1a}-16.81\%}$
test_redq_speed[True-backward] 14.8594ms 13.0214ms 76.7964 Ops/s 84.2914 Ops/s $\textbf{\color{#d91a1a}-8.89\%}$
test_redq_speed[reduce-overhead-None] 6.2821ms 5.2782ms 189.4595 Ops/s 223.0914 Ops/s $\textbf{\color{#d91a1a}-15.08\%}$
test_redq_speed[reduce-overhead-backward] 13.5126ms 13.1200ms 76.2197 Ops/s 85.5009 Ops/s $\textbf{\color{#d91a1a}-10.86\%}$
test_redq_deprec_speed[False-None] 20.5866ms 13.9553ms 71.6571 Ops/s 77.4907 Ops/s $\textbf{\color{#d91a1a}-7.53\%}$
test_redq_deprec_speed[False-backward] 21.6884ms 19.5676ms 51.1050 Ops/s 53.4790 Ops/s $\color{#d91a1a}-4.44\%$
test_redq_deprec_speed[True-None] 4.1899ms 3.6313ms 275.3844 Ops/s 279.4991 Ops/s $\color{#d91a1a}-1.47\%$
test_redq_deprec_speed[True-backward] 9.5485ms 8.6688ms 115.3556 Ops/s 124.2629 Ops/s $\textbf{\color{#d91a1a}-7.17\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.1224ms 3.5840ms 279.0168 Ops/s 273.6408 Ops/s $\color{#35bf28}+1.96\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.5409ms 8.4590ms 118.2169 Ops/s 124.8492 Ops/s $\textbf{\color{#d91a1a}-5.31\%}$
test_td3_speed[False-None] 8.3699ms 8.0866ms 123.6613 Ops/s 124.8125 Ops/s $\color{#d91a1a}-0.92\%$
test_td3_speed[False-backward] 11.0014ms 10.4863ms 95.3628 Ops/s 95.7283 Ops/s $\color{#d91a1a}-0.38\%$
test_td3_speed[True-None] 1.8981ms 1.7013ms 587.7723 Ops/s 581.8127 Ops/s $\color{#35bf28}+1.02\%$
test_td3_speed[True-backward] 3.6901ms 3.3205ms 301.1602 Ops/s 296.5210 Ops/s $\color{#35bf28}+1.56\%$
test_td3_speed[reduce-overhead-None] 1.9003ms 1.7006ms 588.0398 Ops/s 573.8725 Ops/s $\color{#35bf28}+2.47\%$
test_td3_speed[reduce-overhead-backward] 3.6259ms 3.4087ms 293.3647 Ops/s 292.0046 Ops/s $\color{#35bf28}+0.47\%$
test_cql_speed[False-None] 40.4829ms 36.8943ms 27.1044 Ops/s 27.0710 Ops/s $\color{#35bf28}+0.12\%$
test_cql_speed[False-backward] 51.8894ms 48.2470ms 20.7267 Ops/s 20.9528 Ops/s $\color{#d91a1a}-1.08\%$
test_cql_speed[True-None] 16.4697ms 15.3737ms 65.0460 Ops/s 61.8212 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_cql_speed[True-backward] 23.5603ms 22.3348ms 44.7731 Ops/s 43.7946 Ops/s $\color{#35bf28}+2.23\%$
test_cql_speed[reduce-overhead-None] 16.6010ms 15.4862ms 64.5735 Ops/s 62.2800 Ops/s $\color{#35bf28}+3.68\%$
test_cql_speed[reduce-overhead-backward] 23.1849ms 22.3928ms 44.6573 Ops/s 42.7153 Ops/s $\color{#35bf28}+4.55\%$
test_a2c_speed[False-None] 8.9365ms 7.1601ms 139.6631 Ops/s 131.6952 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_a2c_speed[False-backward] 15.4933ms 14.2242ms 70.3029 Ops/s 66.1687 Ops/s $\textbf{\color{#35bf28}+6.25\%}$
test_a2c_speed[True-None] 5.0323ms 4.1799ms 239.2402 Ops/s 234.6072 Ops/s $\color{#35bf28}+1.97\%$
test_a2c_speed[True-backward] 11.2815ms 10.6678ms 93.7405 Ops/s 91.0916 Ops/s $\color{#35bf28}+2.91\%$
test_a2c_speed[reduce-overhead-None] 4.9074ms 4.1607ms 240.3459 Ops/s 230.5099 Ops/s $\color{#35bf28}+4.27\%$
test_a2c_speed[reduce-overhead-backward] 11.5998ms 10.6141ms 94.2141 Ops/s 91.8296 Ops/s $\color{#35bf28}+2.60\%$
test_ppo_speed[False-None] 8.2712ms 7.3311ms 136.4061 Ops/s 131.1689 Ops/s $\color{#35bf28}+3.99\%$
test_ppo_speed[False-backward] 14.8073ms 14.4717ms 69.1004 Ops/s 66.2304 Ops/s $\color{#35bf28}+4.33\%$
test_ppo_speed[True-None] 4.3406ms 3.6665ms 272.7374 Ops/s 266.4586 Ops/s $\color{#35bf28}+2.36\%$
test_ppo_speed[True-backward] 9.9761ms 9.5402ms 104.8198 Ops/s 101.9450 Ops/s $\color{#35bf28}+2.82\%$
test_ppo_speed[reduce-overhead-None] 4.2764ms 3.6520ms 273.8220 Ops/s 267.5480 Ops/s $\color{#35bf28}+2.35\%$
test_ppo_speed[reduce-overhead-backward] 9.8521ms 9.5298ms 104.9337 Ops/s 99.9430 Ops/s $\color{#35bf28}+4.99\%$
test_reinforce_speed[False-None] 8.0648ms 6.4901ms 154.0814 Ops/s 148.3692 Ops/s $\color{#35bf28}+3.85\%$
test_reinforce_speed[False-backward] 10.8236ms 9.7150ms 102.9333 Ops/s 97.4299 Ops/s $\textbf{\color{#35bf28}+5.65\%}$
test_reinforce_speed[True-None] 2.9391ms 2.6067ms 383.6242 Ops/s 370.5611 Ops/s $\color{#35bf28}+3.53\%$
test_reinforce_speed[True-backward] 8.9302ms 8.5347ms 117.1688 Ops/s 111.1381 Ops/s $\textbf{\color{#35bf28}+5.43\%}$
test_reinforce_speed[reduce-overhead-None] 3.2449ms 2.6443ms 378.1653 Ops/s 365.7059 Ops/s $\color{#35bf28}+3.41\%$
test_reinforce_speed[reduce-overhead-backward] 8.9404ms 8.5205ms 117.3644 Ops/s 112.2345 Ops/s $\color{#35bf28}+4.57\%$
test_iql_speed[False-None] 33.0482ms 31.6775ms 31.5681 Ops/s 30.6526 Ops/s $\color{#35bf28}+2.99\%$
test_iql_speed[False-backward] 46.0993ms 44.6372ms 22.4028 Ops/s 21.6596 Ops/s $\color{#35bf28}+3.43\%$
test_iql_speed[True-None] 11.3519ms 10.5152ms 95.1004 Ops/s 88.2671 Ops/s $\textbf{\color{#35bf28}+7.74\%}$
test_iql_speed[True-backward] 22.1246ms 21.2492ms 47.0606 Ops/s 44.5139 Ops/s $\textbf{\color{#35bf28}+5.72\%}$
test_iql_speed[reduce-overhead-None] 11.6186ms 10.7674ms 92.8728 Ops/s 87.7725 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_iql_speed[reduce-overhead-backward] 22.6990ms 21.6282ms 46.2360 Ops/s 42.4142 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3109ms 4.9318ms 202.7675 Ops/s 173.5438 Ops/s $\textbf{\color{#35bf28}+16.84\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8560ms 0.5045ms 1.9822 KOps/s 1.8199 KOps/s $\textbf{\color{#35bf28}+8.91\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8422ms 0.4825ms 2.0724 KOps/s 1.9505 KOps/s $\textbf{\color{#35bf28}+6.25\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2742ms 4.6685ms 214.2012 Ops/s 191.4154 Ops/s $\textbf{\color{#35bf28}+11.90\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.1679ms 0.4952ms 2.0195 KOps/s 1.9201 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6911ms 0.4709ms 2.1238 KOps/s 1.9758 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.5566ms 1.6259ms 615.0331 Ops/s 578.5245 Ops/s $\textbf{\color{#35bf28}+6.31\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1111ms 1.5760ms 634.5364 Ops/s 603.4322 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.6933ms 4.8690ms 205.3802 Ops/s 182.8117 Ops/s $\textbf{\color{#35bf28}+12.35\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0805ms 0.6391ms 1.5646 KOps/s 1.4857 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0658ms 0.6128ms 1.6318 KOps/s 1.5531 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0861ms 4.8553ms 205.9597 Ops/s 188.9042 Ops/s $\textbf{\color{#35bf28}+9.03\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5812ms 0.5279ms 1.8944 KOps/s 1.8529 KOps/s $\color{#35bf28}+2.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8439ms 0.4927ms 2.0294 KOps/s 1.9092 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2535ms 4.6782ms 213.7565 Ops/s 193.2073 Ops/s $\textbf{\color{#35bf28}+10.64\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0954ms 0.4871ms 2.0529 KOps/s 1.9202 KOps/s $\textbf{\color{#35bf28}+6.91\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7164ms 0.4665ms 2.1439 KOps/s 2.0004 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.0663ms 4.8960ms 204.2470 Ops/s 179.6328 Ops/s $\textbf{\color{#35bf28}+13.70\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0494ms 0.6404ms 1.5615 KOps/s 1.4542 KOps/s $\textbf{\color{#35bf28}+7.38\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8260ms 0.6127ms 1.6321 KOps/s 1.5389 KOps/s $\textbf{\color{#35bf28}+6.06\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.4351ms 4.1960ms 238.3232 Ops/s 33.1669 Ops/s $\textbf{\color{#35bf28}+618.56\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.6655ms 2.2897ms 436.7433 Ops/s 399.3155 Ops/s $\textbf{\color{#35bf28}+9.37\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.6747ms 1.2185ms 820.6718 Ops/s 693.4474 Ops/s $\textbf{\color{#35bf28}+18.35\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3989s 12.0775ms 82.7983 Ops/s 205.1134 Ops/s $\textbf{\color{#d91a1a}-59.63\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.6022ms 2.1857ms 457.5114 Ops/s 382.1602 Ops/s $\textbf{\color{#35bf28}+19.72\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.5873ms 1.3946ms 717.0262 Ops/s 824.1095 Ops/s $\textbf{\color{#d91a1a}-12.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.8538ms 4.2961ms 232.7705 Ops/s 211.4505 Ops/s $\textbf{\color{#35bf28}+10.08\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.6156ms 2.4718ms 404.5617 Ops/s 378.9866 Ops/s $\textbf{\color{#35bf28}+6.75\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.4187ms 1.5699ms 636.9641 Ops/s 671.3845 Ops/s $\textbf{\color{#d91a1a}-5.13\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.4425ms 11.2202ms 89.1249 Ops/s 80.9942 Ops/s $\textbf{\color{#35bf28}+10.04\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.8966ms 15.1224ms 66.1272 Ops/s 64.2324 Ops/s $\color{#35bf28}+2.95\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.0183ms 20.0095ms 49.9762 Ops/s 47.2852 Ops/s $\textbf{\color{#35bf28}+5.69\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.7861ms 15.3872ms 64.9890 Ops/s 62.2335 Ops/s $\color{#35bf28}+4.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.8692ms 20.0252ms 49.9372 Ops/s 48.1903 Ops/s $\color{#35bf28}+3.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.8933ms 16.5612ms 60.3820 Ops/s 59.6811 Ops/s $\color{#35bf28}+1.17\%$
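
The Change column in the table above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below is not taken from the benchmark tooling; `relative_change` is a hypothetical helper written only to show that arithmetic, under the assumption that the change is computed as `Ops / Ops on Repo HEAD - 1`.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to the repo HEAD baseline.

    Hypothetical helper: assumes Change = (Ops / Ops on Repo HEAD - 1) * 100.
    """
    if ops_head <= 0:
        raise ValueError("baseline throughput must be positive")
    return (ops_pr / ops_head - 1.0) * 100.0


# Values copied from the CPU table above (same unit within each row).
print(f"{relative_change(1.6517, 1.5788):+.2f}%")  # test_transformed, listed as +4.62%
print(f"{relative_change(4.5833, 5.4258):+.2f}%")  # td0_return_estimate, listed as -15.53%
```

Because the table columns are themselves rounded, the recomputed value can differ in the last digit: the rounded columns of the `test_simple` row give about +4.71%, while the table lists +4.70%.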

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 9. Worsened: 9.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7020s 0.7012s 1.4261 Ops/s 1.3692 Ops/s $\color{#35bf28}+4.16\%$
test_transformed 0.9499s 0.9490s 1.0538 Ops/s 1.0431 Ops/s $\color{#35bf28}+1.03\%$
test_serial 2.0800s 2.0718s 0.4827 Ops/s 0.4755 Ops/s $\color{#35bf28}+1.50\%$
test_parallel 1.9000s 1.8894s 0.5293 Ops/s 0.5377 Ops/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-True-True-True-True] 0.4348ms 37.6645μs 26.5502 KOps/s 26.6378 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-True-True-False] 0.3893ms 21.8902μs 45.6825 KOps/s 46.0266 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[True-True-True-False-True] 0.1234ms 20.6583μs 48.4067 KOps/s 46.7471 KOps/s $\color{#35bf28}+3.55\%$
test_step_mdp_speed[True-True-True-False-False] 0.3942ms 12.1788μs 82.1101 KOps/s 80.8201 KOps/s $\color{#35bf28}+1.60\%$
test_step_mdp_speed[True-True-False-True-True] 0.4248ms 40.6563μs 24.5964 KOps/s 24.3558 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-True-False-True-False] 0.4089ms 23.2551μs 43.0013 KOps/s 41.9882 KOps/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[True-True-False-False-True] 59.8910μs 23.5030μs 42.5478 KOps/s 42.0597 KOps/s $\color{#35bf28}+1.16\%$
test_step_mdp_speed[True-True-False-False-False] 0.4026ms 14.0308μs 71.2718 KOps/s 69.4915 KOps/s $\color{#35bf28}+2.56\%$
test_step_mdp_speed[True-False-True-True-True] 0.4289ms 42.4834μs 23.5386 KOps/s 23.2625 KOps/s $\color{#35bf28}+1.19\%$
test_step_mdp_speed[True-False-True-True-False] 93.0710μs 25.6454μs 38.9934 KOps/s 38.4700 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[True-False-True-False-True] 0.4137ms 23.5581μs 42.4483 KOps/s 41.2421 KOps/s $\color{#35bf28}+2.92\%$
test_step_mdp_speed[True-False-True-False-False] 0.3962ms 14.1632μs 70.6057 KOps/s 69.2187 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[True-False-False-True-True] 0.4304ms 44.5504μs 22.4465 KOps/s 21.9475 KOps/s $\color{#35bf28}+2.27\%$
test_step_mdp_speed[True-False-False-True-False] 0.1906ms 27.4603μs 36.4163 KOps/s 35.7901 KOps/s $\color{#35bf28}+1.75\%$
test_step_mdp_speed[True-False-False-False-True] 57.0310μs 24.9169μs 40.1334 KOps/s 40.1874 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-False-False-False-False] 69.1410μs 15.7938μs 63.3159 KOps/s 62.7845 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-True-True-True] 98.5120μs 42.5673μs 23.4922 KOps/s 23.8509 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-True-True-False] 58.0610μs 25.4863μs 39.2368 KOps/s 39.7293 KOps/s $\color{#d91a1a}-1.24\%$
test_step_mdp_speed[False-True-True-False-True] 0.1170ms 26.7361μs 37.4026 KOps/s 37.1296 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[False-True-True-False-False] 65.8710μs 15.9040μs 62.8774 KOps/s 62.5005 KOps/s $\color{#35bf28}+0.60\%$
test_step_mdp_speed[False-True-False-True-True] 77.2810μs 44.0876μs 22.6821 KOps/s 22.3908 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[False-True-False-True-False] 55.2610μs 27.7421μs 36.0462 KOps/s 35.6529 KOps/s $\color{#35bf28}+1.10\%$
test_step_mdp_speed[False-True-False-False-True] 3.3901ms 29.1287μs 34.3304 KOps/s 34.7311 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-True-False-False-False] 67.7310μs 18.0068μs 55.5346 KOps/s 55.8625 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-True-True-True] 0.1816ms 46.7758μs 21.3786 KOps/s 21.3160 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-False-True-True-False] 82.1510μs 30.5292μs 32.7556 KOps/s 32.9482 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[False-False-True-False-True] 69.5810μs 28.6586μs 34.8936 KOps/s 34.8746 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-True-False-False] 0.1216ms 17.8748μs 55.9448 KOps/s 56.0425 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-False-False-True-True] 94.9120μs 48.1091μs 20.7861 KOps/s 20.5493 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-False-False-True-False] 69.7210μs 32.1636μs 31.0910 KOps/s 30.9504 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[False-False-False-False-True] 79.2220μs 30.3687μs 32.9286 KOps/s 33.0931 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-False-False-False-False] 46.8600μs 19.7111μs 50.7329 KOps/s 50.3832 KOps/s $\color{#35bf28}+0.69\%$
test_values[generalized_advantage_estimate-True-True] 24.8282ms 24.3074ms 41.1397 Ops/s 40.4663 Ops/s $\color{#35bf28}+1.66\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1019s 2.9310ms 341.1763 Ops/s 338.1064 Ops/s $\color{#35bf28}+0.91\%$
test_values[td0_return_estimate-False-False] 0.1035ms 80.0113μs 12.4982 KOps/s 12.5254 KOps/s $\color{#d91a1a}-0.22\%$
test_values[td1_return_estimate-False-False] 54.8636ms 54.3459ms 18.4007 Ops/s 18.1954 Ops/s $\color{#35bf28}+1.13\%$
test_values[vec_td1_return_estimate-False-False] 1.3463ms 1.0765ms 928.8997 Ops/s 920.3229 Ops/s $\color{#35bf28}+0.93\%$
test_values[td_lambda_return_estimate-True-False] 87.3697ms 86.4121ms 11.5724 Ops/s 11.3545 Ops/s $\color{#35bf28}+1.92\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4372ms 1.0766ms 928.8120 Ops/s 927.0809 Ops/s $\color{#35bf28}+0.19\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.5415ms 23.9939ms 41.6773 Ops/s 40.0413 Ops/s $\color{#35bf28}+4.09\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0929ms 0.7688ms 1.3007 KOps/s 1.3053 KOps/s $\color{#d91a1a}-0.36\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8044ms 0.6678ms 1.4975 KOps/s 1.4564 KOps/s $\color{#35bf28}+2.82\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6407ms 1.4814ms 675.0455 Ops/s 674.4636 Ops/s $\color{#35bf28}+0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8415ms 0.6819ms 1.4665 KOps/s 1.4282 KOps/s $\color{#35bf28}+2.69\%$
test_dqn_speed[False-None] 7.3734ms 1.5065ms 663.8058 Ops/s 668.8246 Ops/s $\color{#d91a1a}-0.75\%$
test_dqn_speed[False-backward] 2.2309ms 2.1184ms 472.0637 Ops/s 465.5778 Ops/s $\color{#35bf28}+1.39\%$
test_dqn_speed[True-None] 0.6981ms 0.5265ms 1.8995 KOps/s 1.8699 KOps/s $\color{#35bf28}+1.58\%$
test_dqn_speed[True-backward] 1.1533ms 1.0820ms 924.2269 Ops/s 903.7024 Ops/s $\color{#35bf28}+2.27\%$
test_dqn_speed[reduce-overhead-None] 0.6961ms 0.5370ms 1.8621 KOps/s 1.8275 KOps/s $\color{#35bf28}+1.89\%$
test_dqn_speed[reduce-overhead-backward] 1.0969ms 0.9473ms 1.0556 KOps/s 1.0323 KOps/s $\color{#35bf28}+2.27\%$
test_ddpg_speed[False-None] 3.0945ms 2.7856ms 358.9947 Ops/s 349.8693 Ops/s $\color{#35bf28}+2.61\%$
test_ddpg_speed[False-backward] 4.5653ms 4.1068ms 243.4969 Ops/s 241.9793 Ops/s $\color{#35bf28}+0.63\%$
test_ddpg_speed[True-None] 1.2520ms 1.0483ms 953.9352 Ops/s 935.3712 Ops/s $\color{#35bf28}+1.98\%$
test_ddpg_speed[True-backward] 2.2779ms 2.1188ms 471.9658 Ops/s 464.0198 Ops/s $\color{#35bf28}+1.71\%$
test_ddpg_speed[reduce-overhead-None] 1.2451ms 1.0557ms 947.1954 Ops/s 931.0152 Ops/s $\color{#35bf28}+1.74\%$
test_ddpg_speed[reduce-overhead-backward] 1.7786ms 1.6043ms 623.3379 Ops/s 612.2192 Ops/s $\color{#35bf28}+1.82\%$
test_sac_speed[False-None] 8.5100ms 8.0400ms 124.3780 Ops/s 122.5968 Ops/s $\color{#35bf28}+1.45\%$
test_sac_speed[False-backward] 11.6305ms 10.8828ms 91.8880 Ops/s 89.1172 Ops/s $\color{#35bf28}+3.11\%$
test_sac_speed[True-None] 1.6481ms 1.4822ms 674.6814 Ops/s 651.4276 Ops/s $\color{#35bf28}+3.57\%$
test_sac_speed[True-backward] 3.3638ms 3.2434ms 308.3152 Ops/s 294.9766 Ops/s $\color{#35bf28}+4.52\%$
test_sac_speed[reduce-overhead-None] 22.4014ms 12.3543ms 80.9433 Ops/s 81.6139 Ops/s $\color{#d91a1a}-0.82\%$
test_sac_speed[reduce-overhead-backward] 1.4780ms 1.3594ms 735.6134 Ops/s 664.0044 Ops/s $\textbf{\color{#35bf28}+10.78\%}$
test_redq_speed[False-None] 8.1612ms 7.4281ms 134.6240 Ops/s 130.8586 Ops/s $\color{#35bf28}+2.88\%$
test_redq_speed[False-backward] 11.9834ms 11.2992ms 88.5022 Ops/s 84.0408 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_redq_speed[True-None] 2.2019ms 1.9695ms 507.7383 Ops/s 507.7022 Ops/s $+0.01\%$
test_redq_speed[True-backward] 4.0886ms 3.6501ms 273.9641 Ops/s 275.2902 Ops/s $\color{#d91a1a}-0.48\%$
test_redq_speed[reduce-overhead-None] 2.1789ms 1.9658ms 508.7111 Ops/s 493.0061 Ops/s $\color{#35bf28}+3.19\%$
test_redq_speed[reduce-overhead-backward] 3.7124ms 3.5975ms 277.9737 Ops/s 277.3884 Ops/s $\color{#35bf28}+0.21\%$
test_redq_deprec_speed[False-None] 10.0930ms 9.0222ms 110.8381 Ops/s 110.3384 Ops/s $\color{#35bf28}+0.45\%$
test_redq_deprec_speed[False-backward] 12.5167ms 12.0100ms 83.2638 Ops/s 82.2349 Ops/s $\color{#35bf28}+1.25\%$
test_redq_deprec_speed[True-None] 2.6387ms 2.3141ms 432.1421 Ops/s 412.6434 Ops/s $\color{#35bf28}+4.73\%$
test_redq_deprec_speed[True-backward] 4.5019ms 4.1601ms 240.3777 Ops/s 239.7360 Ops/s $\color{#35bf28}+0.27\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4623ms 2.2688ms 440.7677 Ops/s 430.1824 Ops/s $\color{#35bf28}+2.46\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.4175ms 4.1331ms 241.9504 Ops/s 240.7488 Ops/s $\color{#35bf28}+0.50\%$
test_td3_speed[False-None] 35.9435ms 8.0428ms 124.3348 Ops/s 124.8060 Ops/s $\color{#d91a1a}-0.38\%$
test_td3_speed[False-backward] 10.7905ms 10.4096ms 96.0652 Ops/s 94.1410 Ops/s $\color{#35bf28}+2.04\%$
test_td3_speed[True-None] 1.5655ms 1.5248ms 655.8128 Ops/s 641.2664 Ops/s $\color{#35bf28}+2.27\%$
test_td3_speed[True-backward] 3.6903ms 3.3243ms 300.8186 Ops/s 308.9255 Ops/s $\color{#d91a1a}-2.62\%$
test_td3_speed[reduce-overhead-None] 48.8373ms 24.7868ms 40.3440 Ops/s 37.8430 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_td3_speed[reduce-overhead-backward] 1.6505ms 1.4623ms 683.8446 Ops/s 683.6553 Ops/s $\color{#35bf28}+0.03\%$
test_cql_speed[False-None] 17.2858ms 16.7270ms 59.7837 Ops/s 59.3286 Ops/s $\color{#35bf28}+0.77\%$
test_cql_speed[False-backward] 22.8162ms 22.3000ms 44.8430 Ops/s 44.4965 Ops/s $\color{#35bf28}+0.78\%$
test_cql_speed[True-None] 3.0419ms 2.8538ms 350.4147 Ops/s 342.1638 Ops/s $\color{#35bf28}+2.41\%$
test_cql_speed[True-backward] 5.3837ms 5.0414ms 198.3560 Ops/s 191.6527 Ops/s $\color{#35bf28}+3.50\%$
test_cql_speed[reduce-overhead-None] 21.5569ms 12.9676ms 77.1154 Ops/s 79.5346 Ops/s $\color{#d91a1a}-3.04\%$
test_cql_speed[reduce-overhead-backward] 1.8517ms 1.6731ms 597.7032 Ops/s 591.3359 Ops/s $\color{#35bf28}+1.08\%$
test_a2c_speed[False-None] 3.5803ms 3.1985ms 312.6453 Ops/s 310.9355 Ops/s $\color{#35bf28}+0.55\%$
test_a2c_speed[False-backward] 7.0324ms 6.4286ms 155.5558 Ops/s 153.5542 Ops/s $\color{#35bf28}+1.30\%$
test_a2c_speed[True-None] 1.1798ms 0.9990ms 1.0010 KOps/s 981.4571 Ops/s $\color{#35bf28}+1.99\%$
test_a2c_speed[True-backward] 3.1916ms 2.7196ms 367.6977 Ops/s 359.5182 Ops/s $\color{#35bf28}+2.28\%$
test_a2c_speed[reduce-overhead-None] 21.9641ms 11.5097ms 86.8830 Ops/s 88.9597 Ops/s $\color{#d91a1a}-2.33\%$
test_a2c_speed[reduce-overhead-backward] 1.3037ms 1.1091ms 901.6062 Ops/s 867.3301 Ops/s $\color{#35bf28}+3.95\%$
test_ppo_speed[False-None] 3.9776ms 3.7009ms 270.2075 Ops/s 267.0529 Ops/s $\color{#35bf28}+1.18\%$
test_ppo_speed[False-backward] 7.6038ms 7.1604ms 139.6574 Ops/s 139.4361 Ops/s $\color{#35bf28}+0.16\%$
test_ppo_speed[True-None] 1.1161ms 0.9560ms 1.0461 KOps/s 1.0801 KOps/s $\color{#d91a1a}-3.15\%$
test_ppo_speed[True-backward] 2.7534ms 2.6404ms 378.7254 Ops/s 368.3901 Ops/s $\color{#35bf28}+2.81\%$
test_ppo_speed[reduce-overhead-None] 0.6664ms 0.5071ms 1.9722 KOps/s 1.9496 KOps/s $\color{#35bf28}+1.16\%$
test_ppo_speed[reduce-overhead-backward] 1.2090ms 1.1102ms 900.7745 Ops/s 860.2977 Ops/s $\color{#35bf28}+4.70\%$
test_reinforce_speed[False-None] 2.5385ms 2.3009ms 434.6195 Ops/s 435.6554 Ops/s $\color{#d91a1a}-0.24\%$
test_reinforce_speed[False-backward] 3.8097ms 3.3922ms 294.7955 Ops/s 291.2294 Ops/s $\color{#35bf28}+1.22\%$
test_reinforce_speed[True-None] 1.0105ms 0.8117ms 1.2319 KOps/s 1.2193 KOps/s $\color{#35bf28}+1.04\%$
test_reinforce_speed[True-backward] 2.6864ms 2.5500ms 392.1533 Ops/s 390.4891 Ops/s $\color{#35bf28}+0.43\%$
test_reinforce_speed[reduce-overhead-None] 22.6980ms 11.5519ms 86.5659 Ops/s 90.9048 Ops/s $\color{#d91a1a}-4.77\%$
test_reinforce_speed[reduce-overhead-backward] 1.2324ms 1.1162ms 895.9357 Ops/s 830.8827 Ops/s $\textbf{\color{#35bf28}+7.83\%}$
test_iql_speed[False-None] 9.5701ms 9.0809ms 110.1215 Ops/s 109.1516 Ops/s $\color{#35bf28}+0.89\%$
test_iql_speed[False-backward] 13.5717ms 12.8886ms 77.5879 Ops/s 75.8598 Ops/s $\color{#35bf28}+2.28\%$
test_iql_speed[True-None] 1.9260ms 1.7557ms 569.5681 Ops/s 588.6465 Ops/s $\color{#d91a1a}-3.24\%$
test_iql_speed[True-backward] 4.6212ms 4.1737ms 239.5936 Ops/s 233.0181 Ops/s $\color{#35bf28}+2.82\%$
test_iql_speed[reduce-overhead-None] 19.8852ms 11.3882ms 87.8101 Ops/s 89.2778 Ops/s $\color{#d91a1a}-1.64\%$
test_iql_speed[reduce-overhead-backward] 1.5763ms 1.4173ms 705.5437 Ops/s 700.9076 Ops/s $\color{#35bf28}+0.66\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8840ms 6.2789ms 159.2635 Ops/s 159.1475 Ops/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6698ms 0.3190ms 3.1349 KOps/s 3.4572 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6901ms 0.3578ms 2.7947 KOps/s 3.7786 KOps/s $\textbf{\color{#d91a1a}-26.04\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5844ms 6.0508ms 165.2670 Ops/s 165.1019 Ops/s $\color{#35bf28}+0.10\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3425ms 0.3163ms 3.1614 KOps/s 3.7185 KOps/s $\textbf{\color{#d91a1a}-14.98\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6500ms 0.3274ms 3.0544 KOps/s 2.8853 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7912ms 1.4676ms 681.3907 Ops/s 726.1006 Ops/s $\textbf{\color{#d91a1a}-6.16\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6398ms 1.3831ms 723.0053 Ops/s 798.1683 Ops/s $\textbf{\color{#d91a1a}-9.42\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4984ms 6.2354ms 160.3737 Ops/s 162.1310 Ops/s $\color{#d91a1a}-1.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0783ms 0.4794ms 2.0858 KOps/s 2.0282 KOps/s $\color{#35bf28}+2.84\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8744ms 0.4776ms 2.0940 KOps/s 2.4818 KOps/s $\textbf{\color{#d91a1a}-15.63\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3177ms 6.0772ms 164.5501 Ops/s 165.2972 Ops/s $\color{#d91a1a}-0.45\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.7025ms 0.2926ms 3.4179 KOps/s 3.0235 KOps/s $\textbf{\color{#35bf28}+13.04\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6282ms 0.2952ms 3.3876 KOps/s 3.4949 KOps/s $\color{#d91a1a}-3.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4798ms 6.0666ms 164.8360 Ops/s 167.4608 Ops/s $\color{#d91a1a}-1.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7103ms 0.3487ms 2.8677 KOps/s 2.8821 KOps/s $\color{#d91a1a}-0.50\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6200ms 0.2947ms 3.3936 KOps/s 3.1666 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3996ms 6.1841ms 161.7038 Ops/s 162.8902 Ops/s $\color{#d91a1a}-0.73\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0843ms 0.4720ms 2.1185 KOps/s 2.1332 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7908ms 0.4783ms 2.0907 KOps/s 2.1560 KOps/s $\color{#d91a1a}-3.03\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1369ms 5.3572ms 186.6651 Ops/s 185.3745 Ops/s $\color{#35bf28}+0.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9511ms 1.9700ms 507.6182 Ops/s 512.6414 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.8371ms 1.3006ms 768.8579 Ops/s 829.6979 Ops/s $\textbf{\color{#d91a1a}-7.33\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.9370ms 5.3820ms 185.8051 Ops/s 186.1077 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.1713ms 2.0708ms 482.9057 Ops/s 438.0181 Ops/s $\textbf{\color{#35bf28}+10.25\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.5710ms 1.2847ms 778.4206 Ops/s 852.2551 Ops/s $\textbf{\color{#d91a1a}-8.66\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5436s 16.3319ms 61.2298 Ops/s 32.2736 Ops/s $\textbf{\color{#35bf28}+89.72\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.1900ms 2.2578ms 442.9137 Ops/s 425.3916 Ops/s $\color{#35bf28}+4.12\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2801ms 1.4100ms 709.2439 Ops/s 782.6180 Ops/s $\textbf{\color{#d91a1a}-9.38\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.9961ms 13.3158ms 75.0990 Ops/s 75.9471 Ops/s $\color{#d91a1a}-1.12\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.6982ms 17.3503ms 57.6358 Ops/s 55.8969 Ops/s $\color{#35bf28}+3.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 19.0246ms 18.1242ms 55.1750 Ops/s 54.0049 Ops/s $\color{#35bf28}+2.17\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.9624ms 17.8948ms 55.8820 Ops/s 56.4677 Ops/s $\color{#d91a1a}-1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.3421ms 18.0303ms 55.4622 Ops/s 54.5669 Ops/s $\color{#35bf28}+1.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.8798ms 19.0762ms 52.4213 Ops/s 51.6892 Ops/s $\color{#35bf28}+1.42\%$
