[Quality] IMPALA auto-device #2654

Merged
merged 25 commits into from
Dec 16, 2024

@vmoens vmoens commented Dec 15, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2654

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (8 Unrelated Failures)

As of commit 5159272 with merge base f5a187d:

FLAKY - The following job failed, but the failure was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 15, 2024
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: a4bb57f503aed02d89193c852cc31907878ef0f9
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 822b91506a45d4327c8450b4a845c3fd5276247a
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 6a9288aa79635d7aed0b0ae8e6f4e38ca2038163
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 76f785e4019373b1ea4deb80273a70b5d3e0de05
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 29901a53084261880ff993dc83318a6023e9d4f6
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 05e59ba466e3db83a824dc9e7c215581c1677ca5
Pull Request resolved: #2654

github-actions bot commented Dec 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}5$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4336s 0.4279s 2.3368 Ops/s 2.1574 Ops/s $\textbf{\color{#35bf28}+8.32\%}$
test_transformed 0.6123s 0.6039s 1.6560 Ops/s 1.5412 Ops/s $\textbf{\color{#35bf28}+7.45\%}$
test_serial 1.3359s 1.3309s 0.7514 Ops/s 0.7200 Ops/s $\color{#35bf28}+4.36\%$
test_parallel 1.4162s 1.3384s 0.7472 Ops/s 0.7435 Ops/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-True-True-True-True] 0.3619ms 29.6040μs 33.7793 KOps/s 33.3208 KOps/s $\color{#35bf28}+1.38\%$
test_step_mdp_speed[True-True-True-True-False] 46.9690μs 17.8474μs 56.0305 KOps/s 56.5011 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-True-True-False-True] 51.3070μs 16.7107μs 59.8419 KOps/s 58.5428 KOps/s $\color{#35bf28}+2.22\%$
test_step_mdp_speed[True-True-True-False-False] 38.7330μs 9.9405μs 100.5981 KOps/s 100.0359 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-True-False-True-True] 93.5280μs 31.8620μs 31.3854 KOps/s 30.7251 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-True-False-True-False] 61.3260μs 19.6266μs 50.9512 KOps/s 51.3351 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[True-True-False-False-True] 72.1160μs 18.6151μs 53.7199 KOps/s 53.2928 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-True-False-False-False] 40.1760μs 11.6965μs 85.4956 KOps/s 83.9094 KOps/s $\color{#35bf28}+1.89\%$
test_step_mdp_speed[True-False-True-True-True] 88.1760μs 33.6788μs 29.6923 KOps/s 29.4423 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-False-True-True-False] 0.7075ms 21.5318μs 46.4430 KOps/s 45.9776 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[True-False-True-False-True] 68.8600μs 18.9397μs 52.7991 KOps/s 52.9319 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-True-False-False] 57.5590μs 11.6767μs 85.6404 KOps/s 83.9732 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[True-False-False-True-True] 77.3460μs 35.3464μs 28.2914 KOps/s 28.3813 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-False-False-True-False] 76.1040μs 22.8435μs 43.7761 KOps/s 43.3843 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[True-False-False-False-True] 55.0740μs 20.3297μs 49.1892 KOps/s 49.1176 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[True-False-False-False-False] 69.4110μs 13.5577μs 73.7591 KOps/s 74.6254 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-True-True-True-True] 77.3470μs 33.9525μs 29.4529 KOps/s 29.5021 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-True-True-True-False] 73.9100μs 21.1960μs 47.1787 KOps/s 45.9313 KOps/s $\color{#35bf28}+2.72\%$
test_step_mdp_speed[False-True-True-False-True] 73.8200μs 20.9934μs 47.6340 KOps/s 46.6236 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[False-True-True-False-False] 56.6370μs 12.9696μs 77.1034 KOps/s 76.5123 KOps/s $\color{#35bf28}+0.77\%$
test_step_mdp_speed[False-True-False-True-True] 83.1970μs 35.4079μs 28.2423 KOps/s 28.1102 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[False-True-False-True-False] 68.6790μs 23.2840μs 42.9480 KOps/s 42.9477 KOps/s $+0.00\%$
test_step_mdp_speed[False-True-False-False-True] 2.8068ms 23.4426μs 42.6573 KOps/s 42.7020 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[False-True-False-False-False] 46.5580μs 14.7650μs 67.7276 KOps/s 66.9608 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-False-True-True-True] 98.1350μs 37.2244μs 26.8641 KOps/s 26.5922 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[False-False-True-True-False] 73.3890μs 25.0171μs 39.9726 KOps/s 39.7470 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[False-False-True-False-True] 68.3290μs 22.9336μs 43.6041 KOps/s 43.4546 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-False-True-False-False] 0.5167ms 14.7594μs 67.7536 KOps/s 68.0025 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-False-False-True-True] 70.8240μs 37.8042μs 26.4521 KOps/s 26.1647 KOps/s $\color{#35bf28}+1.10\%$
test_step_mdp_speed[False-False-False-True-False] 81.5250μs 26.0968μs 38.3189 KOps/s 37.4583 KOps/s $\color{#35bf28}+2.30\%$
test_step_mdp_speed[False-False-False-False-True] 75.3240μs 23.9610μs 41.7346 KOps/s 41.0955 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-False-False-False-False] 50.7150μs 16.2479μs 61.5464 KOps/s 60.6954 KOps/s $\color{#35bf28}+1.40\%$
test_values[generalized_advantage_estimate-True-True] 10.0226ms 9.5845ms 104.3351 Ops/s 102.4907 Ops/s $\color{#35bf28}+1.80\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.7295ms 33.9038ms 29.4952 Ops/s 29.4292 Ops/s $\color{#35bf28}+0.22\%$
test_values[td0_return_estimate-False-False] 0.2902ms 0.1990ms 5.0248 KOps/s 5.2676 KOps/s $\color{#d91a1a}-4.61\%$
test_values[td1_return_estimate-False-False] 24.7383ms 24.1957ms 41.3297 Ops/s 41.3925 Ops/s $\color{#d91a1a}-0.15\%$
test_values[vec_td1_return_estimate-False-False] 36.3030ms 34.0342ms 29.3822 Ops/s 28.6759 Ops/s $\color{#35bf28}+2.46\%$
test_values[td_lambda_return_estimate-True-False] 38.6412ms 35.5714ms 28.1124 Ops/s 27.9791 Ops/s $\color{#35bf28}+0.48\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.3035ms 33.9556ms 29.4502 Ops/s 28.7372 Ops/s $\color{#35bf28}+2.48\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.8259ms 8.2372ms 121.4001 Ops/s 118.2398 Ops/s $\color{#35bf28}+2.67\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4539ms 2.0131ms 496.7580 Ops/s 533.7342 Ops/s $\textbf{\color{#d91a1a}-6.93\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5257ms 0.3620ms 2.7627 KOps/s 2.7261 KOps/s $\color{#35bf28}+1.34\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.7542ms 45.3981ms 22.0274 Ops/s 22.1059 Ops/s $\color{#d91a1a}-0.36\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.0224ms 3.0705ms 325.6839 Ops/s 318.3978 Ops/s $\color{#35bf28}+2.29\%$
test_dqn_speed[False-None] 8.7629ms 1.4352ms 696.7878 Ops/s 692.7339 Ops/s $\color{#35bf28}+0.59\%$
test_dqn_speed[False-backward] 2.2405ms 1.9539ms 511.7899 Ops/s 520.3477 Ops/s $\color{#d91a1a}-1.64\%$
test_dqn_speed[True-None] 0.7662ms 0.4632ms 2.1589 KOps/s 2.0172 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_dqn_speed[True-backward] 1.0574ms 0.9461ms 1.0569 KOps/s 1.0614 KOps/s $\color{#d91a1a}-0.43\%$
test_dqn_speed[reduce-overhead-None] 0.7243ms 0.4712ms 2.1222 KOps/s 2.0786 KOps/s $\color{#35bf28}+2.10\%$
test_dqn_speed[reduce-overhead-backward] 1.0199ms 0.9184ms 1.0888 KOps/s 1.0788 KOps/s $\color{#35bf28}+0.93\%$
test_ddpg_speed[False-None] 3.8182ms 2.9275ms 341.5915 Ops/s 336.8796 Ops/s $\color{#35bf28}+1.40\%$
test_ddpg_speed[False-backward] 4.3347ms 4.0621ms 246.1794 Ops/s 243.6943 Ops/s $\color{#35bf28}+1.02\%$
test_ddpg_speed[True-None] 1.5088ms 1.0112ms 988.9162 Ops/s 961.9198 Ops/s $\color{#35bf28}+2.81\%$
test_ddpg_speed[True-backward] 2.3631ms 1.9973ms 500.6719 Ops/s 510.7200 Ops/s $\color{#d91a1a}-1.97\%$
test_ddpg_speed[reduce-overhead-None] 1.3987ms 1.0073ms 992.7144 Ops/s 977.7421 Ops/s $\color{#35bf28}+1.53\%$
test_ddpg_speed[reduce-overhead-backward] 2.3010ms 2.0518ms 487.3793 Ops/s 498.4195 Ops/s $\color{#d91a1a}-2.22\%$
test_sac_speed[False-None] 10.3913ms 8.3530ms 119.7178 Ops/s 116.3676 Ops/s $\color{#35bf28}+2.88\%$
test_sac_speed[False-backward] 14.7475ms 11.3828ms 87.8522 Ops/s 86.2359 Ops/s $\color{#35bf28}+1.87\%$
test_sac_speed[True-None] 2.4584ms 1.8436ms 542.4253 Ops/s 535.3564 Ops/s $\color{#35bf28}+1.32\%$
test_sac_speed[True-backward] 5.6741ms 3.8711ms 258.3277 Ops/s 233.7780 Ops/s $\textbf{\color{#35bf28}+10.50\%}$
test_sac_speed[reduce-overhead-None] 2.4845ms 1.8530ms 539.6757 Ops/s 537.4659 Ops/s $\color{#35bf28}+0.41\%$
test_sac_speed[reduce-overhead-backward] 4.0964ms 3.5693ms 280.1694 Ops/s 271.8601 Ops/s $\color{#35bf28}+3.06\%$
test_redq_speed[False-None] 15.4396ms 13.5961ms 73.5504 Ops/s 74.1074 Ops/s $\color{#d91a1a}-0.75\%$
test_redq_speed[False-backward] 24.8266ms 22.8506ms 43.7626 Ops/s 43.3286 Ops/s $\color{#35bf28}+1.00\%$
test_redq_speed[True-None] 6.4365ms 5.3757ms 186.0227 Ops/s 188.7603 Ops/s $\color{#d91a1a}-1.45\%$
test_redq_speed[True-backward] 20.9230ms 13.2307ms 75.5818 Ops/s 76.5443 Ops/s $\color{#d91a1a}-1.26\%$
test_redq_speed[reduce-overhead-None] 6.4467ms 5.2679ms 189.8272 Ops/s 171.5979 Ops/s $\textbf{\color{#35bf28}+10.62\%}$
test_redq_speed[reduce-overhead-backward] 14.6665ms 13.0546ms 76.6014 Ops/s 76.8955 Ops/s $\color{#d91a1a}-0.38\%$
test_redq_deprec_speed[False-None] 18.7598ms 14.9076ms 67.0797 Ops/s 68.5241 Ops/s $\color{#d91a1a}-2.11\%$
test_redq_deprec_speed[False-backward] 27.1332ms 21.3476ms 46.8437 Ops/s 48.8146 Ops/s $\color{#d91a1a}-4.04\%$
test_redq_deprec_speed[True-None] 4.8308ms 3.9436ms 253.5746 Ops/s 243.2638 Ops/s $\color{#35bf28}+4.24\%$
test_redq_deprec_speed[True-backward] 9.6570ms 8.8663ms 112.7863 Ops/s 103.1462 Ops/s $\textbf{\color{#35bf28}+9.35\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.0663ms 3.9352ms 254.1197 Ops/s 262.0724 Ops/s $\color{#d91a1a}-3.03\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.4637ms 8.5890ms 116.4285 Ops/s 112.8225 Ops/s $\color{#35bf28}+3.20\%$
test_td3_speed[False-None] 34.9290ms 8.6368ms 115.7838 Ops/s 115.5934 Ops/s $\color{#35bf28}+0.16\%$
test_td3_speed[False-backward] 13.4717ms 11.0967ms 90.1168 Ops/s 89.5260 Ops/s $\color{#35bf28}+0.66\%$
test_td3_speed[True-None] 2.3110ms 1.7345ms 576.5324 Ops/s 563.3161 Ops/s $\color{#35bf28}+2.35\%$
test_td3_speed[True-backward] 4.6578ms 3.7261ms 268.3802 Ops/s 274.5349 Ops/s $\color{#d91a1a}-2.24\%$
test_td3_speed[reduce-overhead-None] 2.2853ms 1.7470ms 572.3982 Ops/s 565.5195 Ops/s $\color{#35bf28}+1.22\%$
test_td3_speed[reduce-overhead-backward] 4.8466ms 3.6072ms 277.2237 Ops/s 271.5692 Ops/s $\color{#35bf28}+2.08\%$
test_cql_speed[False-None] 40.7038ms 37.9150ms 26.3748 Ops/s 26.4923 Ops/s $\color{#d91a1a}-0.44\%$
test_cql_speed[False-backward] 66.4480ms 49.4602ms 20.2183 Ops/s 20.4351 Ops/s $\color{#d91a1a}-1.06\%$
test_cql_speed[True-None] 18.0864ms 15.9884ms 62.5452 Ops/s 61.9015 Ops/s $\color{#35bf28}+1.04\%$
test_cql_speed[True-backward] 26.9992ms 23.9485ms 41.7563 Ops/s 42.6895 Ops/s $\color{#d91a1a}-2.19\%$
test_cql_speed[reduce-overhead-None] 17.7897ms 16.3836ms 61.0368 Ops/s 61.0415 Ops/s $-0.01\%$
test_cql_speed[reduce-overhead-backward] 27.2048ms 24.4914ms 40.8307 Ops/s 42.3709 Ops/s $\color{#d91a1a}-3.64\%$
test_a2c_speed[False-None] 8.4729ms 7.6037ms 131.5158 Ops/s 131.5147 Ops/s $+0.00\%$
test_a2c_speed[False-backward] 16.1188ms 15.3633ms 65.0903 Ops/s 63.6003 Ops/s $\color{#35bf28}+2.34\%$
test_a2c_speed[True-None] 4.7609ms 4.3145ms 231.7746 Ops/s 223.2376 Ops/s $\color{#35bf28}+3.82\%$
test_a2c_speed[True-backward] 12.4112ms 11.3021ms 88.4791 Ops/s 83.7189 Ops/s $\textbf{\color{#35bf28}+5.69\%}$
test_a2c_speed[reduce-overhead-None] 6.6906ms 4.8400ms 206.6097 Ops/s 223.9472 Ops/s $\textbf{\color{#d91a1a}-7.74\%}$
test_a2c_speed[reduce-overhead-backward] 12.3600ms 11.0141ms 90.7931 Ops/s 86.5836 Ops/s $\color{#35bf28}+4.86\%$
test_ppo_speed[False-None] 11.4972ms 7.9185ms 126.2858 Ops/s 125.0460 Ops/s $\color{#35bf28}+0.99\%$
test_ppo_speed[False-backward] 18.0477ms 15.2748ms 65.4675 Ops/s 63.3381 Ops/s $\color{#35bf28}+3.36\%$
test_ppo_speed[True-None] 5.2394ms 3.9026ms 256.2414 Ops/s 257.1197 Ops/s $\color{#d91a1a}-0.34\%$
test_ppo_speed[True-backward] 11.7393ms 9.9353ms 100.6513 Ops/s 100.5153 Ops/s $\color{#35bf28}+0.14\%$
test_ppo_speed[reduce-overhead-None] 4.5976ms 3.8260ms 261.3673 Ops/s 257.1479 Ops/s $\color{#35bf28}+1.64\%$
test_ppo_speed[reduce-overhead-backward] 10.8080ms 9.8921ms 101.0905 Ops/s 98.5285 Ops/s $\color{#35bf28}+2.60\%$
test_reinforce_speed[False-None] 8.3034ms 6.9063ms 144.7947 Ops/s 144.6107 Ops/s $\color{#35bf28}+0.13\%$
test_reinforce_speed[False-backward] 11.5101ms 10.2716ms 97.3555 Ops/s 94.8532 Ops/s $\color{#35bf28}+2.64\%$
test_reinforce_speed[True-None] 3.4723ms 2.7904ms 358.3652 Ops/s 353.2093 Ops/s $\color{#35bf28}+1.46\%$
test_reinforce_speed[True-backward] 10.1507ms 9.1910ms 108.8017 Ops/s 110.5477 Ops/s $\color{#d91a1a}-1.58\%$
test_reinforce_speed[reduce-overhead-None] 3.5192ms 2.7303ms 366.2651 Ops/s 355.8296 Ops/s $\color{#35bf28}+2.93\%$
test_reinforce_speed[reduce-overhead-backward] 10.4991ms 8.8633ms 112.8245 Ops/s 108.2705 Ops/s $\color{#35bf28}+4.21\%$
test_iql_speed[False-None] 36.5653ms 32.7242ms 30.5584 Ops/s 29.7503 Ops/s $\color{#35bf28}+2.72\%$
test_iql_speed[False-backward] 50.0572ms 46.2242ms 21.6337 Ops/s 21.3299 Ops/s $\color{#35bf28}+1.42\%$
test_iql_speed[True-None] 13.6539ms 11.7092ms 85.4033 Ops/s 89.4494 Ops/s $\color{#d91a1a}-4.52\%$
test_iql_speed[True-backward] 24.5978ms 22.3158ms 44.8112 Ops/s 44.3969 Ops/s $\color{#35bf28}+0.93\%$
test_iql_speed[reduce-overhead-None] 12.2877ms 11.1363ms 89.7963 Ops/s 89.6070 Ops/s $\color{#35bf28}+0.21\%$
test_iql_speed[reduce-overhead-backward] 24.6601ms 22.5196ms 44.4058 Ops/s 42.7347 Ops/s $\color{#35bf28}+3.91\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.5217ms 5.3471ms 187.0158 Ops/s 166.1557 Ops/s $\textbf{\color{#35bf28}+12.55\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1868ms 0.5563ms 1.7974 KOps/s 1.8448 KOps/s $\color{#d91a1a}-2.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8181ms 0.4992ms 2.0033 KOps/s 2.0135 KOps/s $\color{#d91a1a}-0.51\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0568ms 5.1378ms 194.6364 Ops/s 196.8754 Ops/s $\color{#d91a1a}-1.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2100ms 0.5481ms 1.8246 KOps/s 1.9708 KOps/s $\textbf{\color{#d91a1a}-7.42\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9586ms 0.4963ms 2.0148 KOps/s 2.0603 KOps/s $\color{#d91a1a}-2.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.5959ms 1.6571ms 603.4524 Ops/s 602.5534 Ops/s $\color{#35bf28}+0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 3.2167ms 1.6589ms 602.7950 Ops/s 629.8329 Ops/s $\color{#d91a1a}-4.29\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.0913ms 5.1822ms 192.9699 Ops/s 190.3205 Ops/s $\color{#35bf28}+1.39\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.4251ms 0.6524ms 1.5328 KOps/s 1.5156 KOps/s $\color{#35bf28}+1.14\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0800ms 0.6273ms 1.5942 KOps/s 1.5644 KOps/s $\color{#35bf28}+1.90\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6997ms 4.9664ms 201.3522 Ops/s 200.2741 Ops/s $\color{#35bf28}+0.54\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 4.2316ms 0.5250ms 1.9048 KOps/s 1.9283 KOps/s $\color{#d91a1a}-1.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8951ms 0.5234ms 1.9105 KOps/s 1.9907 KOps/s $\color{#d91a1a}-4.03\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3981ms 5.2260ms 191.3499 Ops/s 194.2241 Ops/s $\color{#d91a1a}-1.48\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6046ms 0.5051ms 1.9799 KOps/s 1.9556 KOps/s $\color{#35bf28}+1.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7766ms 0.4895ms 2.0429 KOps/s 2.0967 KOps/s $\color{#d91a1a}-2.57\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8657ms 5.1926ms 192.5815 Ops/s 194.0858 Ops/s $\color{#d91a1a}-0.78\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1651ms 0.6537ms 1.5297 KOps/s 422.9285 Ops/s $\textbf{\color{#35bf28}+261.69\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9578ms 0.6337ms 1.5781 KOps/s 1.5009 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 8.3796ms 4.5572ms 219.4317 Ops/s 217.0373 Ops/s $\color{#35bf28}+1.10\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.3167ms 2.3670ms 422.4719 Ops/s 409.9306 Ops/s $\color{#35bf28}+3.06\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9482ms 1.2802ms 781.1323 Ops/s 695.2196 Ops/s $\textbf{\color{#35bf28}+12.36\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.1186ms 4.1979ms 238.2134 Ops/s 224.8682 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.7923ms 2.7426ms 364.6110 Ops/s 421.9745 Ops/s $\textbf{\color{#d91a1a}-13.59\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.7304ms 1.3169ms 759.3367 Ops/s 710.5594 Ops/s $\textbf{\color{#35bf28}+6.86\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4832s 14.8906ms 67.1563 Ops/s 226.1391 Ops/s $\textbf{\color{#d91a1a}-70.30\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.6323ms 2.7048ms 369.7151 Ops/s 388.5727 Ops/s $\color{#d91a1a}-4.85\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.2614ms 1.5288ms 654.0916 Ops/s 611.9942 Ops/s $\textbf{\color{#35bf28}+6.88\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.2517ms 11.3955ms 87.7536 Ops/s 81.4704 Ops/s $\textbf{\color{#35bf28}+7.71\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.8592ms 15.3543ms 65.1283 Ops/s 64.1818 Ops/s $\color{#35bf28}+1.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.9214ms 20.1168ms 49.7097 Ops/s 48.5388 Ops/s $\color{#35bf28}+2.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.8869ms 15.3355ms 65.2082 Ops/s 63.8065 Ops/s $\color{#35bf28}+2.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.3910ms 20.2627ms 49.3518 Ops/s 48.5364 Ops/s $\color{#35bf28}+1.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.5273ms 17.0298ms 58.7204 Ops/s 57.6198 Ops/s $\color{#35bf28}+1.91\%$
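
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the Improved/Worsened totals appear to count only rows whose change is at least 5% in magnitude (the rows rendered in bold). The bot comment states neither rule; both are inferred from the rows above. Here is a minimal Python sketch under those assumptions, with hypothetical helper names (`parse_ops`, `percent_change`, `classify`) that are not part of the benchmark tooling:

```python
import re

# Throughput units that appear in the table, mapped to a multiplier in Ops/s.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a cell such as '1.5297 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(K?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def percent_change(ops: str, ops_head: str) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # Two rows copied from the CPU table above (mixed units in the second one).
    rows = {
        "test_simple": ("2.3368 Ops/s", "2.1574 Ops/s"),
        "test_rb_iterate[...PrioritizedReplayBuffer-LazyMemmapStorage...]": (
            "1.5297 KOps/s",
            "422.9285 Ops/s",
        ),
    }
    for name, (ops, head) in rows.items():
        change = percent_change(ops, head)
        print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Normalizing units before dividing matters for rows where the two columns use different prefixes, for example 1.5297 KOps/s against 422.9285 Ops/s.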


github-actions bot commented Dec 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}15$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7695s 0.7611s 1.3140 Ops/s 1.2815 Ops/s $\color{#35bf28}+2.53\%$
test_transformed 1.0202s 1.0162s 0.9840 Ops/s 0.9865 Ops/s $\color{#d91a1a}-0.26\%$
test_serial 2.2153s 2.1708s 0.4607 Ops/s 0.4625 Ops/s $\color{#d91a1a}-0.39\%$
test_parallel 1.9589s 1.9536s 0.5119 Ops/s 0.5043 Ops/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[True-True-True-True-True] 0.2577ms 39.0398μs 25.6149 KOps/s 25.5877 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[True-True-True-True-False] 50.3510μs 23.0763μs 43.3345 KOps/s 43.7313 KOps/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[True-True-True-False-True] 52.2510μs 21.5843μs 46.3300 KOps/s 46.5182 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[True-True-True-False-False] 38.6200μs 12.8122μs 78.0505 KOps/s 79.0570 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-True-False-True-True] 0.1077ms 42.8495μs 23.3375 KOps/s 23.9415 KOps/s $\color{#d91a1a}-2.52\%$
test_step_mdp_speed[True-True-False-True-False] 87.6010μs 24.7780μs 40.3584 KOps/s 40.4915 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-False-False-True] 57.7100μs 24.6434μs 40.5788 KOps/s 42.4588 KOps/s $\color{#d91a1a}-4.43\%$
test_step_mdp_speed[True-True-False-False-False] 42.3610μs 15.1727μs 65.9080 KOps/s 66.7913 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[True-False-True-True-True] 75.6110μs 45.2920μs 22.0790 KOps/s 22.4647 KOps/s $\color{#d91a1a}-1.72\%$
test_step_mdp_speed[True-False-True-True-False] 53.9210μs 27.5527μs 36.2941 KOps/s 36.2518 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-False-True-False-True] 86.0510μs 23.7596μs 42.0882 KOps/s 41.8999 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[True-False-True-False-False] 53.5310μs 15.2591μs 65.5345 KOps/s 67.2389 KOps/s $\color{#d91a1a}-2.53\%$
test_step_mdp_speed[True-False-False-True-True] 79.3300μs 46.6646μs 21.4295 KOps/s 21.4036 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-False-False-True-False] 58.8500μs 29.5615μs 33.8278 KOps/s 34.0146 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-False-False-False-True] 56.0200μs 26.5391μs 37.6803 KOps/s 38.8783 KOps/s $\color{#d91a1a}-3.08\%$
test_step_mdp_speed[True-False-False-False-False] 84.3510μs 17.3125μs 57.7617 KOps/s 58.4995 KOps/s $\color{#d91a1a}-1.26\%$
test_step_mdp_speed[False-True-True-True-True] 73.2310μs 44.9566μs 22.2437 KOps/s 22.2317 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-True-True-True-False] 63.0110μs 27.4572μs 36.4203 KOps/s 36.3126 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[False-True-True-False-True] 56.7910μs 28.5441μs 35.0335 KOps/s 36.0642 KOps/s $\color{#d91a1a}-2.86\%$
test_step_mdp_speed[False-True-True-False-False] 42.4200μs 16.7744μs 59.6147 KOps/s 60.7077 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-True-False-True-True] 0.1173ms 46.5209μs 21.4957 KOps/s 21.6308 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-True-False-True-False] 64.8910μs 29.6115μs 33.7706 KOps/s 33.8926 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[False-True-False-False-True] 3.2512ms 31.0005μs 32.2576 KOps/s 33.3065 KOps/s $\color{#d91a1a}-3.15\%$
test_step_mdp_speed[False-True-False-False-False] 45.7310μs 19.2205μs 52.0277 KOps/s 54.0132 KOps/s $\color{#d91a1a}-3.68\%$
test_step_mdp_speed[False-False-True-True-True] 87.1810μs 49.5259μs 20.1914 KOps/s 20.2385 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[False-False-True-True-False] 68.6800μs 32.5237μs 30.7468 KOps/s 31.6058 KOps/s $\color{#d91a1a}-2.72\%$
test_step_mdp_speed[False-False-True-False-True] 61.1810μs 30.4432μs 32.8481 KOps/s 33.4403 KOps/s $\color{#d91a1a}-1.77\%$
test_step_mdp_speed[False-False-True-False-False] 48.5400μs 19.0409μs 52.5185 KOps/s 53.4136 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[False-False-False-True-True] 0.1081ms 51.0968μs 19.5707 KOps/s 19.6502 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[False-False-False-True-False] 64.9200μs 33.7914μs 29.5934 KOps/s 29.3496 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-False-False-False-True] 69.8610μs 31.8481μs 31.3990 KOps/s 31.7874 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-False-False-False-False] 46.6510μs 20.9363μs 47.7639 KOps/s 48.3283 KOps/s $\color{#d91a1a}-1.17\%$
test_values[generalized_advantage_estimate-True-True] 25.7911ms 25.2459ms 39.6104 Ops/s 39.9058 Ops/s $\color{#d91a1a}-0.74\%$
test_values[vec_generalized_advantage_estimate-True-True] 96.7100ms 2.8439ms 351.6255 Ops/s 339.1473 Ops/s $\color{#35bf28}+3.68\%$
test_values[td0_return_estimate-False-False] 0.1094ms 83.4487μs 11.9834 KOps/s 12.1025 KOps/s $\color{#d91a1a}-0.98\%$
test_values[td1_return_estimate-False-False] 56.6410ms 56.4020ms 17.7299 Ops/s 17.8396 Ops/s $\color{#d91a1a}-0.61\%$
test_values[vec_td1_return_estimate-False-False] 1.3418ms 1.1035ms 906.2174 Ops/s 910.5988 Ops/s $\color{#d91a1a}-0.48\%$
test_values[td_lambda_return_estimate-True-False] 90.2291ms 89.6574ms 11.1536 Ops/s 11.0889 Ops/s $\color{#35bf28}+0.58\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3728ms 1.1005ms 908.7035 Ops/s 912.7664 Ops/s $\color{#d91a1a}-0.45\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.1404ms 25.0260ms 39.9584 Ops/s 40.4915 Ops/s $\color{#d91a1a}-1.32\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0551ms 0.7831ms 1.2770 KOps/s 1.3068 KOps/s $\color{#d91a1a}-2.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8337ms 0.6912ms 1.4468 KOps/s 1.4660 KOps/s $\color{#d91a1a}-1.31\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5386ms 1.4994ms 666.9126 Ops/s 670.2050 Ops/s $\color{#d91a1a}-0.49\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7697ms 0.7020ms 1.4245 KOps/s 1.4392 KOps/s $\color{#d91a1a}-1.02\%$
test_dqn_speed[False-None] 7.1756ms 1.5502ms 645.0932 Ops/s 656.1830 Ops/s $\color{#d91a1a}-1.69\%$
test_dqn_speed[False-backward] 2.2406ms 2.1738ms 460.0173 Ops/s 459.8870 Ops/s $\color{#35bf28}+0.03\%$
test_dqn_speed[True-None] 0.6202ms 0.5342ms 1.8720 KOps/s 1.8406 KOps/s $\color{#35bf28}+1.71\%$
test_dqn_speed[True-backward] 1.2592ms 1.1997ms 833.5495 Ops/s 889.4994 Ops/s $\textbf{\color{#d91a1a}-6.29\%}$
test_dqn_speed[reduce-overhead-None] 0.6137ms 0.5512ms 1.8143 KOps/s 1.7929 KOps/s $\color{#35bf28}+1.19\%$
test_dqn_speed[reduce-overhead-backward] 1.1298ms 1.0600ms 943.4275 Ops/s 917.1089 Ops/s $\color{#35bf28}+2.87\%$
test_ddpg_speed[False-None] 3.2707ms 2.8942ms 345.5132 Ops/s 341.9841 Ops/s $\color{#35bf28}+1.03\%$
test_ddpg_speed[False-backward] 4.7436ms 4.3286ms 231.0225 Ops/s 231.5353 Ops/s $\color{#d91a1a}-0.22\%$
test_ddpg_speed[True-None] 1.1724ms 1.0870ms 919.9919 Ops/s 916.2828 Ops/s $\color{#35bf28}+0.40\%$
test_ddpg_speed[True-backward] 2.3419ms 2.3032ms 434.1830 Ops/s 454.6137 Ops/s $\color{#d91a1a}-4.49\%$
test_ddpg_speed[reduce-overhead-None] 1.2005ms 1.1410ms 876.4133 Ops/s 912.2779 Ops/s $\color{#d91a1a}-3.93\%$
test_ddpg_speed[reduce-overhead-backward] 1.8890ms 1.7685ms 565.4360 Ops/s 600.3633 Ops/s $\textbf{\color{#d91a1a}-5.82\%}$
test_sac_speed[False-None] 8.5888ms 8.1849ms 122.1768 Ops/s 121.2131 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[False-backward] 12.0107ms 11.5323ms 86.7132 Ops/s 87.8533 Ops/s $\color{#d91a1a}-1.30\%$
test_sac_speed[True-None] 1.6316ms 1.5379ms 650.2366 Ops/s 625.1595 Ops/s $\color{#35bf28}+4.01\%$
test_sac_speed[True-backward] 3.6185ms 3.4698ms 288.2011 Ops/s 302.9157 Ops/s $\color{#d91a1a}-4.86\%$
test_sac_speed[reduce-overhead-None] 22.3068ms 12.4595ms 80.2601 Ops/s 79.7508 Ops/s $\color{#35bf28}+0.64\%$
test_sac_speed[reduce-overhead-backward] 1.5998ms 1.5120ms 661.3942 Ops/s 657.2565 Ops/s $\color{#35bf28}+0.63\%$
test_redq_speed[False-None] 8.2921ms 7.5944ms 131.6760 Ops/s 129.9142 Ops/s $\color{#35bf28}+1.36\%$
test_redq_speed[False-backward] 12.9828ms 11.9275ms 83.8399 Ops/s 82.7866 Ops/s $\color{#35bf28}+1.27\%$
test_redq_speed[True-None] 2.0940ms 2.0187ms 495.3770 Ops/s 472.2870 Ops/s $\color{#35bf28}+4.89\%$
test_redq_speed[True-backward] 4.2499ms 3.9089ms 255.8260 Ops/s 254.8906 Ops/s $\color{#35bf28}+0.37\%$
test_redq_speed[reduce-overhead-None] 2.0967ms 2.0107ms 497.3285 Ops/s 483.5491 Ops/s $\color{#35bf28}+2.85\%$
test_redq_speed[reduce-overhead-backward] 3.9374ms 3.7643ms 265.6540 Ops/s 255.1555 Ops/s $\color{#35bf28}+4.11\%$
test_redq_deprec_speed[False-None] 9.6634ms 9.2196ms 108.4649 Ops/s 106.4868 Ops/s $\color{#35bf28}+1.86\%$
test_redq_deprec_speed[False-backward] 13.0256ms 12.3872ms 80.7286 Ops/s 77.9958 Ops/s $\color{#35bf28}+3.50\%$
test_redq_deprec_speed[True-None] 2.5628ms 2.3740ms 421.2376 Ops/s 420.0411 Ops/s $\color{#35bf28}+0.28\%$
test_redq_deprec_speed[True-backward] 4.6646ms 4.2472ms 235.4515 Ops/s 233.5260 Ops/s $\color{#35bf28}+0.82\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4776ms 2.3588ms 423.9500 Ops/s 413.8261 Ops/s $\color{#35bf28}+2.45\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6442ms 4.2545ms 235.0442 Ops/s 245.9541 Ops/s $\color{#d91a1a}-4.44\%$
test_td3_speed[False-None] 8.1096ms 8.0486ms 124.2454 Ops/s 123.6007 Ops/s $\color{#35bf28}+0.52\%$
test_td3_speed[False-backward] 11.1985ms 10.7562ms 92.9699 Ops/s 94.6276 Ops/s $\color{#d91a1a}-1.75\%$
test_td3_speed[True-None] 1.6411ms 1.5831ms 631.6819 Ops/s 630.3326 Ops/s $\color{#35bf28}+0.21\%$
test_td3_speed[True-backward] 3.2437ms 3.1430ms 318.1701 Ops/s 317.3853 Ops/s $\color{#35bf28}+0.25\%$
test_td3_speed[reduce-overhead-None] 51.2923ms 26.1546ms 38.2341 Ops/s 36.9325 Ops/s $\color{#35bf28}+3.52\%$
test_td3_speed[reduce-overhead-backward] 1.3666ms 1.2944ms 772.5464 Ops/s 687.8851 Ops/s $\textbf{\color{#35bf28}+12.31\%}$
test_cql_speed[False-None] 17.6981ms 17.0815ms 58.5429 Ops/s 58.1624 Ops/s $\color{#35bf28}+0.65\%$
test_cql_speed[False-backward] 23.6751ms 22.5865ms 44.2742 Ops/s 43.2818 Ops/s $\color{#35bf28}+2.29\%$
test_cql_speed[True-None] 3.0676ms 2.9701ms 336.6906 Ops/s 335.1347 Ops/s $\color{#35bf28}+0.46\%$
test_cql_speed[True-backward] 5.7706ms 5.3505ms 186.8994 Ops/s 182.7523 Ops/s $\color{#35bf28}+2.27\%$
test_cql_speed[reduce-overhead-None] 21.2403ms 13.1050ms 76.3068 Ops/s 76.2627 Ops/s $\color{#35bf28}+0.06\%$
test_cql_speed[reduce-overhead-backward] 1.7564ms 1.6895ms 591.8782 Ops/s 646.4877 Ops/s $\textbf{\color{#d91a1a}-8.45\%}$
test_a2c_speed[False-None] 3.4194ms 3.2698ms 305.8289 Ops/s 303.1041 Ops/s $\color{#35bf28}+0.90\%$
test_a2c_speed[False-backward] 7.0740ms 6.5668ms 152.2803 Ops/s 156.5863 Ops/s $\color{#d91a1a}-2.75\%$
test_a2c_speed[True-None] 1.1186ms 1.0085ms 991.5747 Ops/s 983.5991 Ops/s $\color{#35bf28}+0.81\%$
test_a2c_speed[True-backward] 2.8297ms 2.7713ms 360.8388 Ops/s 357.1018 Ops/s $\color{#35bf28}+1.05\%$
test_a2c_speed[reduce-overhead-None] 20.9170ms 11.3773ms 87.8943 Ops/s 86.6235 Ops/s $\color{#35bf28}+1.47\%$
test_a2c_speed[reduce-overhead-backward] 1.1455ms 1.1153ms 896.5958 Ops/s 871.7797 Ops/s $\color{#35bf28}+2.85\%$
test_ppo_speed[False-None] 3.9296ms 3.7636ms 265.7035 Ops/s 263.6528 Ops/s $\color{#35bf28}+0.78\%$
test_ppo_speed[False-backward] 7.6602ms 7.3294ms 136.4363 Ops/s 137.7736 Ops/s $\color{#d91a1a}-0.97\%$
test_ppo_speed[True-None] 1.0298ms 0.9575ms 1.0444 KOps/s 1.0503 KOps/s $\color{#d91a1a}-0.56\%$
test_ppo_speed[True-backward] 2.7663ms 2.7213ms 367.4667 Ops/s 387.6021 Ops/s $\textbf{\color{#d91a1a}-5.19\%}$
test_ppo_speed[reduce-overhead-None] 0.5765ms 0.5060ms 1.9763 KOps/s 1.8738 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_ppo_speed[reduce-overhead-backward] 1.1465ms 1.1110ms 900.1096 Ops/s 990.5926 Ops/s $\textbf{\color{#d91a1a}-9.13\%}$
test_reinforce_speed[False-None] 2.4651ms 2.3158ms 431.8089 Ops/s 429.9324 Ops/s $\color{#35bf28}+0.44\%$
test_reinforce_speed[False-backward] 3.8824ms 3.4424ms 290.4930 Ops/s 296.8458 Ops/s $\color{#d91a1a}-2.14\%$
test_reinforce_speed[True-None] 0.9824ms 0.8400ms 1.1905 KOps/s 1.1886 KOps/s $\color{#35bf28}+0.16\%$
test_reinforce_speed[True-backward] 2.7259ms 2.5783ms 387.8587 Ops/s 403.9608 Ops/s $\color{#d91a1a}-3.99\%$
test_reinforce_speed[reduce-overhead-None] 21.9737ms 11.6305ms 85.9805 Ops/s 87.6083 Ops/s $\color{#d91a1a}-1.86\%$
test_reinforce_speed[reduce-overhead-backward] 1.2112ms 1.1669ms 856.9965 Ops/s 928.7062 Ops/s $\textbf{\color{#d91a1a}-7.72\%}$
test_iql_speed[False-None] 9.8770ms 9.3862ms 106.5397 Ops/s 106.1688 Ops/s $\color{#35bf28}+0.35\%$
test_iql_speed[False-backward] 14.1348ms 13.5130ms 74.0029 Ops/s 74.9601 Ops/s $\color{#d91a1a}-1.28\%$
test_iql_speed[True-None] 1.9990ms 1.7779ms 562.4527 Ops/s 573.8218 Ops/s $\color{#d91a1a}-1.98\%$
test_iql_speed[True-backward] 4.3158ms 4.2441ms 235.6229 Ops/s 231.1174 Ops/s $\color{#35bf28}+1.95\%$
test_iql_speed[reduce-overhead-None] 20.0102ms 11.4811ms 87.0995 Ops/s 88.9297 Ops/s $\color{#d91a1a}-2.06\%$
test_iql_speed[reduce-overhead-backward] 1.5258ms 1.4436ms 692.7073 Ops/s 622.9311 Ops/s $\textbf{\color{#35bf28}+11.20\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9765ms 6.4649ms 154.6811 Ops/s 152.4315 Ops/s $\color{#35bf28}+1.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5987ms 0.3610ms 2.7704 KOps/s 2.6115 KOps/s $\textbf{\color{#35bf28}+6.08\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5639ms 0.3694ms 2.7074 KOps/s 2.9296 KOps/s $\textbf{\color{#d91a1a}-7.58\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5576ms 6.2549ms 159.8751 Ops/s 158.7196 Ops/s $\color{#35bf28}+0.73\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9124ms 0.2765ms 3.6162 KOps/s 3.8325 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5497ms 0.2734ms 3.6574 KOps/s 4.1396 KOps/s $\textbf{\color{#d91a1a}-11.65\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4827ms 1.2733ms 785.3488 Ops/s 786.5545 Ops/s $\color{#d91a1a}-0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5356ms 1.2794ms 781.5911 Ops/s 736.0995 Ops/s $\textbf{\color{#35bf28}+6.18\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6029ms 6.4079ms 156.0583 Ops/s 155.6932 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8795ms 0.4929ms 2.0287 KOps/s 2.3541 KOps/s $\textbf{\color{#d91a1a}-13.82\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7350ms 0.4618ms 2.1653 KOps/s 2.5252 KOps/s $\textbf{\color{#d91a1a}-14.25\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1816ms 6.2514ms 159.9638 Ops/s 159.0617 Ops/s $\color{#35bf28}+0.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0164ms 0.3425ms 2.9194 KOps/s 3.1522 KOps/s $\textbf{\color{#d91a1a}-7.39\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4695ms 0.2904ms 3.4437 KOps/s 2.9933 KOps/s $\textbf{\color{#35bf28}+15.05\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5638ms 6.2006ms 161.2760 Ops/s 160.0089 Ops/s $\color{#35bf28}+0.79\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6041ms 0.3639ms 2.7482 KOps/s 3.7540 KOps/s $\textbf{\color{#d91a1a}-26.79\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5783ms 0.3084ms 3.2423 KOps/s 3.6465 KOps/s $\textbf{\color{#d91a1a}-11.08\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5650ms 6.4026ms 156.1862 Ops/s 155.5175 Ops/s $\color{#35bf28}+0.43\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7194ms 0.4547ms 2.1992 KOps/s 2.1205 KOps/s $\color{#35bf28}+3.71\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8075ms 0.4335ms 2.3070 KOps/s 2.5269 KOps/s $\textbf{\color{#d91a1a}-8.70\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9744ms 5.3609ms 186.5357 Ops/s 187.6892 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.0796ms 2.0650ms 484.2572 Ops/s 437.6130 Ops/s $\textbf{\color{#35bf28}+10.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9508ms 0.9542ms 1.0480 KOps/s 866.8749 Ops/s $\textbf{\color{#35bf28}+20.89\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.0056ms 5.4045ms 185.0322 Ops/s 188.2596 Ops/s $\color{#d91a1a}-1.71\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.1825ms 2.0775ms 481.3455 Ops/s 479.8096 Ops/s $\color{#35bf28}+0.32\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.4309ms 1.1402ms 877.0034 Ops/s 761.6109 Ops/s $\textbf{\color{#35bf28}+15.15\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5080s 15.7241ms 63.5965 Ops/s 33.1910 Ops/s $\textbf{\color{#35bf28}+91.61\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0095ms 1.8479ms 541.1632 Ops/s 453.0608 Ops/s $\textbf{\color{#35bf28}+19.45\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.5013ms 1.2334ms 810.7854 Ops/s 721.3668 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6557ms 13.4192ms 74.5201 Ops/s 72.9961 Ops/s $\color{#35bf28}+2.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.3633ms 17.8683ms 55.9650 Ops/s 56.8788 Ops/s $\color{#d91a1a}-1.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.6977ms 18.1840ms 54.9935 Ops/s 53.7246 Ops/s $\color{#35bf28}+2.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.3532ms 18.0367ms 55.4426 Ops/s 56.0372 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.4838ms 17.7366ms 56.3805 Ops/s 53.9212 Ops/s $\color{#35bf28}+4.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.2766ms 19.2080ms 52.0616 Ops/s 52.0903 Ops/s $\color{#d91a1a}-0.06\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 4fb68db2bb0fc1b4246fe13574913592886ca6cf
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 04c9a8ae697c5bcc9411b1a2b8f4ab260f13aa9b
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: f46774ac30a40a414d8eac67a6192680ae06c4f8
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: eb556f60f46d355348311567cacaf1146b2e88df
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 193ae53957ef9c460585eac68123aeb30ab473d1
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 202e8a48b78cc03277f928cbab696d26253bc0ee
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: e9544b47c591d3af7ffdc409d39099d725339ef9
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: e98c9d6080de542a4c782e8cbf5ad33c5dc0e5bb
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 297bc9b8e8342e03e6697e9cf31275efa64a2698
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 36444170dc40db057ba5ee06b70157842f970b22
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 4bc5fe3d06ba9eb7840487d3535429c89b6ea72c
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 9cd1e6d90f69232e7a0f042bbcbe1527d1e6ee26
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 4b63265e6dfb75f31e82a02661544e6049b9a405
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 04ebbe09714518b328243d4680035d0527465ac5
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: a439301cf4cac58474cbb910de71a1f190c316ed
Pull Request resolved: #2654
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: a439301cf4cac58474cbb910de71a1f190c316ed
Pull Request resolved: #2654
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: abbb3048f33c9f7f6a623e32e139871093ea74fa
Pull Request resolved: #2654
@vmoens vmoens added quality code quality BE Better errors, logs, docs or test utils labels Dec 16, 2024
@vmoens vmoens merged commit 5159272 into gh/vmoens/55/base Dec 16, 2024
70 of 78 checks passed
@vmoens vmoens deleted the gh/vmoens/55/head branch December 16, 2024 02:11