[Benchmark] Add benchmark for compiled ReplayBuffer.extend/sample #2514

Merged
1 commit merged on Oct 25, 2024

@kurtamohler (Collaborator) commented Oct 25, 2024

[ghstack-poisoned]

pytorch-bot bot commented Oct 25, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2514

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 5 Unrelated Failures

As of commit 163b4a9 with merge base 0f29c7e:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kurtamohler added a commit that referenced this pull request Oct 25, 2024
ghstack-source-id: d4562697e2c1a8392cf5bdcadb50f8b7b6939e41
Pull Request resolved: #2514
@facebook-github-bot added the CLA Signed label Oct 25, 2024
@kurtamohler requested a review from vmoens October 25, 2024 01:50

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 145. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}8$.
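
The Improved and Worsened counts appear to match the rows whose Change value is rendered in bold in the detailed results, and every such row moves by more than 5% in either direction. A minimal sketch of that classification follows. The 5% cutoff is an assumption inferred from the table, not taken from the benchmark tooling, and `classify` and `THRESHOLD_PCT` are hypothetical names.

```python
from collections import Counter

# Assumed cutoff, inferred from which rows are bold in the results table.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a relative throughput change as improved, worsened, or neutral."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "neutral"


# A few Change values (in percent) copied from the results table.
changes = {
    "test_dqn_speed[False-None]": 16.56,
    "test_redq_speed[False-backward]": 44.50,
    "test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]": -32.35,
    "test_simple": 0.62,
    "test_step_mdp_speed[True-True-True-False-True]": -4.84,
}

summary = Counter(classify(value) for value in changes.values())
print(dict(summary))
```

Under this assumption, changes inside the ±5% band are treated as noise and count toward neither total.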

Expand to view detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.7247s | 0.7240s | 1.3813 Ops/s | 1.3727 Ops/s | $\color{#35bf28}+0.62\%$ |
| test_transformed | 1.0575s | 0.9806s | 1.0198 Ops/s | 1.0359 Ops/s | $\color{#d91a1a}-1.56\%$ |
| test_serial | 2.1948s | 2.1173s | 0.4723 Ops/s | 0.4750 Ops/s | $\color{#d91a1a}-0.57\%$ |
| test_parallel | 2.0456s | 1.9947s | 0.5013 Ops/s | 0.4953 Ops/s | $\color{#35bf28}+1.22\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1402ms | 37.3060μs | 26.8054 KOps/s | 27.4671 KOps/s | $\color{#d91a1a}-2.41\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 48.0110μs | 22.3893μs | 44.6641 KOps/s | 46.4162 KOps/s | $\color{#d91a1a}-3.77\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 55.7010μs | 20.4659μs | 48.8616 KOps/s | 51.3450 KOps/s | $\color{#d91a1a}-4.84\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 40.1700μs | 12.0857μs | 82.7421 KOps/s | 85.3896 KOps/s | $\color{#d91a1a}-3.10\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 79.1020μs | 39.5953μs | 25.2555 KOps/s | 25.2010 KOps/s | $\color{#35bf28}+0.22\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 51.8710μs | 24.4135μs | 40.9609 KOps/s | 42.6227 KOps/s | $\color{#d91a1a}-3.90\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 49.6510μs | 22.4099μs | 44.6231 KOps/s | 45.5995 KOps/s | $\color{#d91a1a}-2.14\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 46.1410μs | 14.4897μs | 69.0145 KOps/s | 71.9102 KOps/s | $\color{#d91a1a}-4.03\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 75.8410μs | 42.4008μs | 23.5845 KOps/s | 24.1734 KOps/s | $\color{#d91a1a}-2.44\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 56.6410μs | 27.1576μs | 36.8221 KOps/s | 38.3525 KOps/s | $\color{#d91a1a}-3.99\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 41.7200μs | 22.7198μs | 44.0144 KOps/s | 45.6571 KOps/s | $\color{#d91a1a}-3.60\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 63.8520μs | 14.3096μs | 69.8831 KOps/s | 70.7012 KOps/s | $\color{#d91a1a}-1.16\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 70.1320μs | 44.6390μs | 22.4019 KOps/s | 22.9234 KOps/s | $\color{#d91a1a}-2.27\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 59.3610μs | 29.6376μs | 33.7409 KOps/s | 34.8623 KOps/s | $\color{#d91a1a}-3.22\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 51.5710μs | 25.5423μs | 39.1507 KOps/s | 40.9970 KOps/s | $\color{#d91a1a}-4.50\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 43.5210μs | 17.0035μs | 58.8114 KOps/s | 60.1328 KOps/s | $\color{#d91a1a}-2.20\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 70.3910μs | 42.3111μs | 23.6344 KOps/s | 24.0845 KOps/s | $\color{#d91a1a}-1.87\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 0.1124ms | 27.2736μs | 36.6655 KOps/s | 38.2646 KOps/s | $\color{#d91a1a}-4.18\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 59.3910μs | 26.6351μs | 37.5445 KOps/s | 36.3315 KOps/s | $\color{#35bf28}+3.34\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 43.9210μs | 16.8166μs | 59.4651 KOps/s | 59.5847 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 79.9520μs | 44.5922μs | 22.4254 KOps/s | 22.6234 KOps/s | $\color{#d91a1a}-0.87\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 60.3610μs | 29.0832μs | 34.3841 KOps/s | 34.7079 KOps/s | $\color{#d91a1a}-0.93\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.3194ms | 29.2343μs | 34.2063 KOps/s | 33.9409 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 81.9610μs | 18.9552μs | 52.7560 KOps/s | 52.6142 KOps/s | $\color{#35bf28}+0.27\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 89.7820μs | 46.8545μs | 21.3427 KOps/s | 21.2063 KOps/s | $\color{#35bf28}+0.64\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 60.4620μs | 31.8429μs | 31.4042 KOps/s | 31.3104 KOps/s | $\color{#35bf28}+0.30\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 61.2610μs | 29.6077μs | 33.7750 KOps/s | 33.3798 KOps/s | $\color{#35bf28}+1.18\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 54.4910μs | 19.4096μs | 51.5209 KOps/s | 52.5213 KOps/s | $\color{#d91a1a}-1.90\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 90.4420μs | 49.4698μs | 20.2143 KOps/s | 20.6294 KOps/s | $\color{#d91a1a}-2.01\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 61.6010μs | 34.0769μs | 29.3454 KOps/s | 29.5927 KOps/s | $\color{#d91a1a}-0.84\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 70.7810μs | 31.2198μs | 32.0310 KOps/s | 32.4987 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 54.7010μs | 21.7524μs | 45.9719 KOps/s | 47.0103 KOps/s | $\color{#d91a1a}-2.21\%$ |
| test_values[generalized_advantage_estimate-True-True] | 25.0647ms | 24.6525ms | 40.5639 Ops/s | 40.4933 Ops/s | $\color{#35bf28}+0.17\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1005s | 2.8998ms | 344.8467 Ops/s | 327.2836 Ops/s | $\textbf{\color{#35bf28}+5.37\%}$ |
| test_values[td0_return_estimate-False-False] | 89.4820μs | 66.7092μs | 14.9904 KOps/s | 15.0094 KOps/s | $\color{#d91a1a}-0.13\%$ |
| test_values[td1_return_estimate-False-False] | 55.4044ms | 55.1335ms | 18.1378 Ops/s | 18.3790 Ops/s | $\color{#d91a1a}-1.31\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.2786ms | 1.0738ms | 931.2530 Ops/s | 930.1485 Ops/s | $\color{#35bf28}+0.12\%$ |
| test_values[td_lambda_return_estimate-True-False] | 87.5751ms | 87.0271ms | 11.4907 Ops/s | 11.5489 Ops/s | $\color{#d91a1a}-0.50\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3109ms | 1.0708ms | 933.8865 Ops/s | 934.4878 Ops/s | $\color{#d91a1a}-0.06\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.5043ms | 24.2367ms | 41.2598 Ops/s | 41.1772 Ops/s | $\color{#35bf28}+0.20\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0262ms | 0.7420ms | 1.3478 KOps/s | 1.3162 KOps/s | $\color{#35bf28}+2.40\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7505ms | 0.6593ms | 1.5168 KOps/s | 1.5100 KOps/s | $\color{#35bf28}+0.45\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5095ms | 1.4662ms | 682.0284 Ops/s | 681.1020 Ops/s | $\color{#35bf28}+0.14\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7999ms | 0.6751ms | 1.4814 KOps/s | 1.4776 KOps/s | $\color{#35bf28}+0.26\%$ |
| test_dqn_speed[False-None] | 1.4279ms | 1.2902ms | 775.0899 Ops/s | 664.9738 Ops/s | $\textbf{\color{#35bf28}+16.56\%}$ |
| test_dqn_speed[False-backward] | 1.9084ms | 1.8272ms | 547.2888 Ops/s | 534.7718 Ops/s | $\color{#35bf28}+2.34\%$ |
| test_dqn_speed[True-None] | 1.1353ms | 0.5652ms | 1.7693 KOps/s | 1.7575 KOps/s | $\color{#35bf28}+0.67\%$ |
| test_dqn_speed[True-backward] | 1.0520ms | 1.0046ms | 995.4476 Ops/s | 969.1008 Ops/s | $\color{#35bf28}+2.72\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.8686ms | 0.5627ms | 1.7772 KOps/s | 1.8066 KOps/s | $\color{#d91a1a}-1.63\%$ |
| test_dqn_speed[reduce-overhead-backward] | 1.0569ms | 1.0037ms | 996.3025 Ops/s | 966.9975 Ops/s | $\color{#35bf28}+3.03\%$ |
| test_ddpg_speed[False-None] | 2.9753ms | 2.6614ms | 375.7418 Ops/s | 372.4022 Ops/s | $\color{#35bf28}+0.90\%$ |
| test_ddpg_speed[False-backward] | 4.0437ms | 3.9160ms | 255.3656 Ops/s | 252.6653 Ops/s | $\color{#35bf28}+1.07\%$ |
| test_ddpg_speed[True-None] | 1.5983ms | 1.2376ms | 807.9930 Ops/s | 809.8698 Ops/s | $\color{#d91a1a}-0.23\%$ |
| test_ddpg_speed[True-backward] | 2.2646ms | 2.2095ms | 452.5898 Ops/s | 443.4394 Ops/s | $\color{#35bf28}+2.06\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.4864ms | 1.2507ms | 799.5811 Ops/s | 792.4085 Ops/s | $\color{#35bf28}+0.91\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.2829ms | 2.2201ms | 450.4347 Ops/s | 449.9720 Ops/s | $\color{#35bf28}+0.10\%$ |
| test_sac_speed[False-None] | 8.4696ms | 7.5000ms | 133.3335 Ops/s | 130.8433 Ops/s | $\color{#35bf28}+1.90\%$ |
| test_sac_speed[False-backward] | 11.2840ms | 10.8089ms | 92.5167 Ops/s | 92.1935 Ops/s | $\color{#35bf28}+0.35\%$ |
| test_sac_speed[True-None] | 2.4229ms | 2.0273ms | 493.2718 Ops/s | 489.7019 Ops/s | $\color{#35bf28}+0.73\%$ |
| test_sac_speed[True-backward] | 4.1301ms | 4.0016ms | 249.9003 Ops/s | 237.3357 Ops/s | $\textbf{\color{#35bf28}+5.29\%}$ |
| test_sac_speed[reduce-overhead-None] | 2.2656ms | 2.0467ms | 488.5914 Ops/s | 491.5316 Ops/s | $\color{#d91a1a}-0.60\%$ |
| test_sac_speed[reduce-overhead-backward] | 4.1939ms | 3.9980ms | 250.1250 Ops/s | 252.0157 Ops/s | $\color{#d91a1a}-0.75\%$ |
| test_redq_speed[False-None] | 14.6527ms | 10.1475ms | 98.5465 Ops/s | 102.2321 Ops/s | $\color{#d91a1a}-3.61\%$ |
| test_redq_speed[False-backward] | 17.8514ms | 17.0725ms | 58.5738 Ops/s | 40.5369 Ops/s | $\textbf{\color{#35bf28}+44.50\%}$ |
| test_redq_speed[True-None] | 4.0246ms | 3.6573ms | 273.4292 Ops/s | 277.3685 Ops/s | $\color{#d91a1a}-1.42\%$ |
| test_redq_speed[True-backward] | 8.8831ms | 8.6645ms | 115.4136 Ops/s | 117.2609 Ops/s | $\color{#d91a1a}-1.58\%$ |
| test_redq_speed[reduce-overhead-None] | 3.8370ms | 3.5500ms | 281.6889 Ops/s | 275.6458 Ops/s | $\color{#35bf28}+2.19\%$ |
| test_redq_speed[reduce-overhead-backward] | 8.9146ms | 8.5917ms | 116.3918 Ops/s | 116.6084 Ops/s | $\color{#d91a1a}-0.19\%$ |
| test_redq_deprec_speed[False-None] | 12.1523ms | 10.4558ms | 95.6406 Ops/s | 95.3588 Ops/s | $\color{#35bf28}+0.30\%$ |
| test_redq_deprec_speed[False-backward] | 16.0275ms | 15.4281ms | 64.8168 Ops/s | 65.6461 Ops/s | $\color{#d91a1a}-1.26\%$ |
| test_redq_deprec_speed[True-None] | 3.5739ms | 3.2286ms | 309.7335 Ops/s | 307.5825 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_redq_deprec_speed[True-backward] | 7.3944ms | 7.1557ms | 139.7480 Ops/s | 138.2697 Ops/s | $\color{#35bf28}+1.07\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 3.5831ms | 3.2161ms | 310.9342 Ops/s | 312.1585 Ops/s | $\color{#d91a1a}-0.39\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 7.3844ms | 7.1707ms | 139.4559 Ops/s | 135.5173 Ops/s | $\color{#35bf28}+2.91\%$ |
| test_td3_speed[False-None] | 7.4924ms | 7.4419ms | 134.3735 Ops/s | 131.7094 Ops/s | $\color{#35bf28}+2.02\%$ |
| test_td3_speed[False-backward] | 10.4929ms | 10.3039ms | 97.0510 Ops/s | 94.6241 Ops/s | $\color{#35bf28}+2.56\%$ |
| test_td3_speed[True-None] | 1.9429ms | 1.9085ms | 523.9648 Ops/s | 511.7892 Ops/s | $\color{#35bf28}+2.38\%$ |
| test_td3_speed[True-backward] | 3.8283ms | 3.7245ms | 268.4920 Ops/s | 250.4049 Ops/s | $\textbf{\color{#35bf28}+7.22\%}$ |
| test_td3_speed[reduce-overhead-None] | 1.9321ms | 1.9049ms | 524.9566 Ops/s | 514.6173 Ops/s | $\color{#35bf28}+2.01\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.8187ms | 3.7160ms | 269.1040 Ops/s | 271.7981 Ops/s | $\color{#d91a1a}-0.99\%$ |
| test_cql_speed[False-None] | 28.3456ms | 24.9959ms | 40.0066 Ops/s | 40.3531 Ops/s | $\color{#d91a1a}-0.86\%$ |
| test_cql_speed[False-backward] | 39.3962ms | 35.0683ms | 28.5158 Ops/s | 29.6124 Ops/s | $\color{#d91a1a}-3.70\%$ |
| test_cql_speed[True-None] | 11.2509ms | 10.9828ms | 91.0514 Ops/s | 91.3287 Ops/s | $\color{#d91a1a}-0.30\%$ |
| test_cql_speed[True-backward] | 17.2658ms | 16.7836ms | 59.5820 Ops/s | 58.6804 Ops/s | $\color{#35bf28}+1.54\%$ |
| test_cql_speed[reduce-overhead-None] | 11.3718ms | 10.9529ms | 91.2998 Ops/s | 90.9005 Ops/s | $\color{#35bf28}+0.44\%$ |
| test_cql_speed[reduce-overhead-backward] | 20.1524ms | 17.4028ms | 57.4621 Ops/s | 58.8264 Ops/s | $\color{#d91a1a}-2.32\%$ |
| test_a2c_speed[False-None] | 5.5766ms | 5.3118ms | 188.2591 Ops/s | 184.9553 Ops/s | $\color{#35bf28}+1.79\%$ |
| test_a2c_speed[False-backward] | 12.0810ms | 11.7288ms | 85.2602 Ops/s | 84.5344 Ops/s | $\color{#35bf28}+0.86\%$ |
| test_a2c_speed[True-None] | 3.4025ms | 3.0478ms | 328.1057 Ops/s | 318.9109 Ops/s | $\color{#35bf28}+2.88\%$ |
| test_a2c_speed[True-backward] | 8.8598ms | 8.5875ms | 116.4483 Ops/s | 117.3641 Ops/s | $\color{#d91a1a}-0.78\%$ |
| test_a2c_speed[reduce-overhead-None] | 3.2055ms | 3.0303ms | 330.0013 Ops/s | 323.7980 Ops/s | $\color{#35bf28}+1.92\%$ |
| test_a2c_speed[reduce-overhead-backward] | 8.7837ms | 8.5077ms | 117.5401 Ops/s | 117.6216 Ops/s | $\color{#d91a1a}-0.07\%$ |
| test_ppo_speed[False-None] | 7.1790ms | 5.7691ms | 173.3358 Ops/s | 176.6455 Ops/s | $\color{#d91a1a}-1.87\%$ |
| test_ppo_speed[False-backward] | 12.7603ms | 12.3683ms | 80.8520 Ops/s | 82.4711 Ops/s | $\color{#d91a1a}-1.96\%$ |
| test_ppo_speed[True-None] | 3.7449ms | 3.4612ms | 288.9200 Ops/s | 288.7630 Ops/s | $\color{#35bf28}+0.05\%$ |
| test_ppo_speed[True-backward] | 8.6944ms | 8.3749ms | 119.4044 Ops/s | 121.4437 Ops/s | $\color{#d91a1a}-1.68\%$ |
| test_ppo_speed[reduce-overhead-None] | 3.8403ms | 3.4615ms | 288.8888 Ops/s | 291.1117 Ops/s | $\color{#d91a1a}-0.76\%$ |
| test_ppo_speed[reduce-overhead-backward] | 8.5206ms | 8.3118ms | 120.3112 Ops/s | 119.5477 Ops/s | $\color{#35bf28}+0.64\%$ |
| test_reinforce_speed[False-None] | 4.6561ms | 4.4026ms | 227.1406 Ops/s | 229.8540 Ops/s | $\color{#d91a1a}-1.18\%$ |
| test_reinforce_speed[False-backward] | 7.5601ms | 7.3207ms | 136.5985 Ops/s | 138.7638 Ops/s | $\color{#d91a1a}-1.56\%$ |
| test_reinforce_speed[True-None] | 2.6480ms | 2.2321ms | 448.0174 Ops/s | 434.6713 Ops/s | $\color{#35bf28}+3.07\%$ |
| test_reinforce_speed[True-backward] | 7.4554ms | 7.1969ms | 138.9489 Ops/s | 139.5256 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_reinforce_speed[reduce-overhead-None] | 2.7940ms | 2.2424ms | 445.9444 Ops/s | 441.2806 Ops/s | $\color{#35bf28}+1.06\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 7.3432ms | 7.0986ms | 140.8729 Ops/s | 139.3393 Ops/s | $\color{#35bf28}+1.10\%$ |
| test_iql_speed[False-None] | 20.2910ms | 19.2475ms | 51.9548 Ops/s | 49.8796 Ops/s | $\color{#35bf28}+4.16\%$ |
| test_iql_speed[False-backward] | 31.0354ms | 29.9305ms | 33.4107 Ops/s | 32.6096 Ops/s | $\color{#35bf28}+2.46\%$ |
| test_iql_speed[True-None] | 7.2765ms | 6.7690ms | 147.7326 Ops/s | 144.8366 Ops/s | $\color{#35bf28}+2.00\%$ |
| test_iql_speed[True-backward] | 16.0485ms | 15.5279ms | 64.4000 Ops/s | 61.8607 Ops/s | $\color{#35bf28}+4.10\%$ |
| test_iql_speed[reduce-overhead-None] | 7.4591ms | 6.7972ms | 147.1190 Ops/s | 146.9741 Ops/s | $\color{#35bf28}+0.10\%$ |
| test_iql_speed[reduce-overhead-backward] | 16.0029ms | 15.6602ms | 63.8561 Ops/s | 63.6490 Ops/s | $\color{#35bf28}+0.33\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4926ms | 6.2138ms | 160.9334 Ops/s | 163.1347 Ops/s | $\color{#d91a1a}-1.35\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0323ms | 0.2815ms | 3.5524 KOps/s | 4.2542 KOps/s | $\textbf{\color{#d91a1a}-16.50\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6512ms | 0.2861ms | 3.4957 KOps/s | 4.4918 KOps/s | $\textbf{\color{#d91a1a}-22.18\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.2351ms | 5.9609ms | 167.7589 Ops/s | 169.0786 Ops/s | $\color{#d91a1a}-0.78\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9915ms | 0.2716ms | 3.6823 KOps/s | 2.9688 KOps/s | $\textbf{\color{#35bf28}+24.03\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4488ms | 0.2274ms | 4.3969 KOps/s | 3.5313 KOps/s | $\textbf{\color{#35bf28}+24.51\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5803ms | 1.2038ms | 830.6957 Ops/s | 715.8863 Ops/s | $\textbf{\color{#35bf28}+16.04\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.3404ms | 1.1586ms | 863.1044 Ops/s | 740.5887 Ops/s | $\textbf{\color{#35bf28}+16.54\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2818ms | 6.1443ms | 162.7516 Ops/s | 165.0999 Ops/s | $\color{#d91a1a}-1.42\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1837ms | 0.4163ms | 2.4021 KOps/s | 2.2818 KOps/s | $\textbf{\color{#35bf28}+5.27\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7004ms | 0.4112ms | 2.4316 KOps/s | 2.4038 KOps/s | $\color{#35bf28}+1.16\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1830ms | 5.9673ms | 167.5794 Ops/s | 167.3422 Ops/s | $\color{#35bf28}+0.14\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9106ms | 0.2724ms | 3.6708 KOps/s | 3.2721 KOps/s | $\textbf{\color{#35bf28}+12.18\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 1.3634ms | 0.3187ms | 3.1376 KOps/s | 4.6383 KOps/s | $\textbf{\color{#d91a1a}-32.35\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.5525ms | 5.9585ms | 167.8269 Ops/s | 171.4990 Ops/s | $\color{#d91a1a}-2.14\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.3803ms | 0.2350ms | 4.2552 KOps/s | 3.3630 KOps/s | $\textbf{\color{#35bf28}+26.53\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5145ms | 0.2426ms | 4.1227 KOps/s | 4.7574 KOps/s | $\textbf{\color{#d91a1a}-13.34\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.4144ms | 6.1171ms | 163.4760 Ops/s | 165.6992 Ops/s | $\color{#d91a1a}-1.34\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9920ms | 0.4936ms | 2.0258 KOps/s | 2.3811 KOps/s | $\textbf{\color{#d91a1a}-14.92\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7407ms | 0.4777ms | 2.0932 KOps/s | 2.7683 KOps/s | $\textbf{\color{#d91a1a}-24.39\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 7.0488ms | 5.2628ms | 190.0142 Ops/s | 35.3554 Ops/s | $\textbf{\color{#35bf28}+437.44\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 10.8026ms | 2.0389ms | 490.4699 Ops/s | 496.4853 Ops/s | $\color{#d91a1a}-1.21\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 2.0543ms | 1.0585ms | 944.7509 Ops/s | 845.0143 Ops/s | $\textbf{\color{#35bf28}+11.80\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.4240s | 13.7213ms | 72.8792 Ops/s | 186.7988 Ops/s | $\textbf{\color{#d91a1a}-60.99\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.4681ms | 1.9767ms | 505.8949 Ops/s | 489.3915 Ops/s | $\color{#35bf28}+3.37\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 8.2266ms | 1.1978ms | 834.8823 Ops/s | 947.7880 Ops/s | $\textbf{\color{#d91a1a}-11.91\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.4493ms | 5.5099ms | 181.4930 Ops/s | 180.8643 Ops/s | $\color{#35bf28}+0.35\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.2125ms | 2.1594ms | 463.0904 Ops/s | 473.5613 Ops/s | $\color{#d91a1a}-2.21\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.8778ms | 1.3357ms | 748.6959 Ops/s | 711.6548 Ops/s | $\textbf{\color{#35bf28}+5.20\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000-100-True] | 45.0436ms | 43.0096ms | 23.2506 Ops/s | 22.4890 Ops/s | $\color{#35bf28}+3.39\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000-100-False] | 10.4728ms | 9.9044ms | 100.9653 Ops/s | 99.9544 Ops/s | $\color{#35bf28}+1.01\%$ |
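
The columns above appear to be related in a simple way: Ops looks like the reciprocal of Mean, and Change looks like the relative difference between Ops and Ops on Repo HEAD. The sketch below checks that reading against the `test_dqn_speed[False-None]` row. The relationship is inferred from the numbers rather than from the benchmark tooling, and the function names are hypothetical.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean duration, assuming Ops = 1 / Mean."""
    return 1.0 / mean_seconds


def relative_change_pct(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


# Values copied from the test_dqn_speed[False-None] row:
# Mean 1.2902ms, 775.0899 Ops/s on this PR, 664.9738 Ops/s on repo HEAD.
ops = ops_per_second(1.2902e-3)
change = relative_change_pct(775.0899, 664.9738)

print(f"{ops:.2f} Ops/s, {change:+.2f}%")
```

Small mismatches in the last digit are expected for other rows, since the table shows rounded values while the report was presumably computed from unrounded ones.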

@vmoens added the Benchmarks label Oct 25, 2024

@vmoens (Contributor) left a comment

lgtm! thanks!

@vmoens closed this in 5e03a55 Oct 25, 2024
@vmoens merged commit 163b4a9 into gh/kurtamohler/1/base Oct 25, 2024
71 of 78 checks passed
@vmoens deleted the gh/kurtamohler/1/head branch October 25, 2024 16:59