[Feature] IQL compatibility with compile #2649

Merged 38 commits on Dec 16, 2024

Conversation

vmoens (Contributor) commented Dec 14, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2649

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 7 Unrelated Failures

As of commit b73eea2 with merge base f149811:

NEW FAILURE - The following job has failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label on Dec 14, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 84267fc499950c43714db823accde31fa708e693
Pull Request resolved: #2649

github-actions bot commented Dec 14, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}2$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4279s 0.4251s 2.3522 Ops/s 2.2006 Ops/s $\textbf{\color{#35bf28}+6.89\%}$
test_transformed 0.6140s 0.6080s 1.6447 Ops/s 1.5808 Ops/s $\color{#35bf28}+4.04\%$
test_serial 1.3409s 1.3368s 0.7480 Ops/s 0.7319 Ops/s $\color{#35bf28}+2.20\%$
test_parallel 1.3884s 1.2990s 0.7698 Ops/s 0.7677 Ops/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-True-True-True-True] 0.1345ms 29.2679μs 34.1671 KOps/s 33.7469 KOps/s $\color{#35bf28}+1.25\%$
test_step_mdp_speed[True-True-True-True-False] 63.7360μs 17.1851μs 58.1899 KOps/s 57.1304 KOps/s $\color{#35bf28}+1.85\%$
test_step_mdp_speed[True-True-True-False-True] 45.1350μs 16.6647μs 60.0070 KOps/s 57.9191 KOps/s $\color{#35bf28}+3.60\%$
test_step_mdp_speed[True-True-True-False-False] 60.2630μs 9.8818μs 101.1962 KOps/s 98.2633 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[True-True-False-True-True] 62.2660μs 31.8195μs 31.4273 KOps/s 30.6864 KOps/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[True-True-False-True-False] 47.5880μs 19.2186μs 52.0330 KOps/s 50.7504 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[True-True-False-False-True] 70.0510μs 18.5744μs 53.8375 KOps/s 52.0267 KOps/s $\color{#35bf28}+3.48\%$
test_step_mdp_speed[True-True-False-False-False] 33.5620μs 11.6828μs 85.5956 KOps/s 83.8192 KOps/s $\color{#35bf28}+2.12\%$
test_step_mdp_speed[True-False-True-True-True] 85.6200μs 33.2820μs 30.0462 KOps/s 29.1694 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-False-True-True-False] 53.6500μs 20.8622μs 47.9336 KOps/s 46.1470 KOps/s $\color{#35bf28}+3.87\%$
test_step_mdp_speed[True-False-True-False-True] 66.9540μs 18.2550μs 54.7794 KOps/s 52.2416 KOps/s $\color{#35bf28}+4.86\%$
test_step_mdp_speed[True-False-True-False-False] 36.2980μs 11.6142μs 86.1016 KOps/s 83.9961 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[True-False-False-True-True] 87.4830μs 34.7998μs 28.7358 KOps/s 27.6723 KOps/s $\color{#35bf28}+3.84\%$
test_step_mdp_speed[True-False-False-True-False] 67.6560μs 22.8067μs 43.8467 KOps/s 42.9030 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[True-False-False-False-True] 0.6330ms 20.3612μs 49.1130 KOps/s 47.9811 KOps/s $\color{#35bf28}+2.36\%$
test_step_mdp_speed[True-False-False-False-False] 48.1200μs 13.4330μs 74.4437 KOps/s 73.6703 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-True-True-True] 75.6710μs 33.4052μs 29.9355 KOps/s 29.0514 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[False-True-True-True-False] 61.8850μs 21.1331μs 47.3192 KOps/s 46.6661 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-True-True-False-True] 80.1900μs 21.1248μs 47.3377 KOps/s 46.0491 KOps/s $\color{#35bf28}+2.80\%$
test_step_mdp_speed[False-True-True-False-False] 41.0960μs 12.9481μs 77.2313 KOps/s 76.0836 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[False-True-False-True-True] 98.9270μs 34.8387μs 28.7037 KOps/s 27.8530 KOps/s $\color{#35bf28}+3.05\%$
test_step_mdp_speed[False-True-False-True-False] 50.7040μs 22.7405μs 43.9745 KOps/s 43.1688 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-True-False-False-True] 2.7081ms 22.5516μs 44.3428 KOps/s 43.4988 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[False-True-False-False-False] 39.2130μs 14.5309μs 68.8187 KOps/s 68.0222 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[False-False-True-True-True] 78.1650μs 36.5831μs 27.3350 KOps/s 26.6079 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-False-True-True-False] 60.4220μs 24.5077μs 40.8035 KOps/s 40.5005 KOps/s $\color{#35bf28}+0.75\%$
test_step_mdp_speed[False-False-True-False-True] 52.7590μs 22.2605μs 44.9226 KOps/s 44.1546 KOps/s $\color{#35bf28}+1.74\%$
test_step_mdp_speed[False-False-True-False-False] 69.1890μs 14.5075μs 68.9299 KOps/s 67.5287 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[False-False-False-True-True] 78.0460μs 37.9156μs 26.3744 KOps/s 25.7954 KOps/s $\color{#35bf28}+2.24\%$
test_step_mdp_speed[False-False-False-True-False] 73.4990μs 25.7431μs 38.8453 KOps/s 38.1473 KOps/s $\color{#35bf28}+1.83\%$
test_step_mdp_speed[False-False-False-False-True] 66.4620μs 23.7319μs 42.1373 KOps/s 41.1583 KOps/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[False-False-False-False-False] 0.6680ms 16.2239μs 61.6374 KOps/s 60.8098 KOps/s $\color{#35bf28}+1.36\%$
test_values[generalized_advantage_estimate-True-True] 9.7972ms 9.3600ms 106.8376 Ops/s 104.8235 Ops/s $\color{#35bf28}+1.92\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.6025ms 33.3833ms 29.9551 Ops/s 30.0580 Ops/s $\color{#d91a1a}-0.34\%$
test_values[td0_return_estimate-False-False] 0.2356ms 0.1753ms 5.7057 KOps/s 5.6397 KOps/s $\color{#35bf28}+1.17\%$
test_values[td1_return_estimate-False-False] 24.5731ms 23.4162ms 42.7054 Ops/s 42.2595 Ops/s $\color{#35bf28}+1.06\%$
test_values[vec_td1_return_estimate-False-False] 35.4091ms 33.4491ms 29.8962 Ops/s 29.9842 Ops/s $\color{#d91a1a}-0.29\%$
test_values[td_lambda_return_estimate-True-False] 36.7125ms 33.5277ms 29.8261 Ops/s 29.5244 Ops/s $\color{#35bf28}+1.02\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.8779ms 33.3344ms 29.9990 Ops/s 29.6383 Ops/s $\color{#35bf28}+1.22\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.7779ms 8.2466ms 121.2622 Ops/s 118.5745 Ops/s $\color{#35bf28}+2.27\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2565ms 1.9969ms 500.7681 Ops/s 489.8195 Ops/s $\color{#35bf28}+2.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5970ms 0.3631ms 2.7542 KOps/s 2.7576 KOps/s $\color{#d91a1a}-0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 55.2448ms 43.8796ms 22.7896 Ops/s 23.3485 Ops/s $\color{#d91a1a}-2.39\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.2910ms 3.0355ms 329.4374 Ops/s 304.2088 Ops/s $\textbf{\color{#35bf28}+8.29\%}$
test_dqn_speed[False-None] 5.9039ms 1.3650ms 732.5880 Ops/s 718.8535 Ops/s $\color{#35bf28}+1.91\%$
test_dqn_speed[False-backward] 2.3004ms 1.8488ms 540.8899 Ops/s 534.8034 Ops/s $\color{#35bf28}+1.14\%$
test_dqn_speed[True-None] 0.7047ms 0.4693ms 2.1307 KOps/s 2.1443 KOps/s $\color{#d91a1a}-0.63\%$
test_dqn_speed[True-backward] 0.9212ms 0.8728ms 1.1458 KOps/s 1.1232 KOps/s $\color{#35bf28}+2.01\%$
test_dqn_speed[reduce-overhead-None] 0.8397ms 0.4677ms 2.1381 KOps/s 2.1619 KOps/s $\color{#d91a1a}-1.10\%$
test_dqn_speed[reduce-overhead-backward] 0.9347ms 0.8751ms 1.1428 KOps/s 1.1248 KOps/s $\color{#35bf28}+1.60\%$
test_ddpg_speed[False-None] 3.6108ms 2.8744ms 347.8928 Ops/s 348.5771 Ops/s $\color{#d91a1a}-0.20\%$
test_ddpg_speed[False-backward] 4.2886ms 3.9837ms 251.0239 Ops/s 251.3626 Ops/s $\color{#d91a1a}-0.13\%$
test_ddpg_speed[True-None] 1.3176ms 0.9948ms 1.0052 KOps/s 1.0038 KOps/s $\color{#35bf28}+0.14\%$
test_ddpg_speed[True-backward] 1.9364ms 1.8674ms 535.5072 Ops/s 453.0908 Ops/s $\textbf{\color{#35bf28}+18.19\%}$
test_ddpg_speed[reduce-overhead-None] 1.5145ms 0.9904ms 1.0097 KOps/s 1.0029 KOps/s $\color{#35bf28}+0.68\%$
test_ddpg_speed[reduce-overhead-backward] 1.9239ms 1.8676ms 535.4508 Ops/s 527.8950 Ops/s $\color{#35bf28}+1.43\%$
test_sac_speed[False-None] 9.4315ms 7.9932ms 125.1061 Ops/s 123.2972 Ops/s $\color{#35bf28}+1.47\%$
test_sac_speed[False-backward] 12.1640ms 10.7304ms 93.1936 Ops/s 92.4589 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[True-None] 2.3926ms 1.8054ms 553.8858 Ops/s 549.5418 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[True-backward] 3.5781ms 3.4884ms 286.6631 Ops/s 267.3014 Ops/s $\textbf{\color{#35bf28}+7.24\%}$
test_sac_speed[reduce-overhead-None] 2.2846ms 1.8115ms 552.0226 Ops/s 547.1742 Ops/s $\color{#35bf28}+0.89\%$
test_sac_speed[reduce-overhead-backward] 3.6087ms 3.4925ms 286.3252 Ops/s 285.6512 Ops/s $\color{#35bf28}+0.24\%$
test_redq_speed[False-None] 14.6412ms 12.8770ms 77.6581 Ops/s 76.8925 Ops/s $\color{#35bf28}+1.00\%$
test_redq_speed[False-backward] 23.5545ms 21.8260ms 45.8168 Ops/s 45.4850 Ops/s $\color{#35bf28}+0.73\%$
test_redq_speed[True-None] 5.3790ms 4.5151ms 221.4779 Ops/s 223.7328 Ops/s $\color{#d91a1a}-1.01\%$
test_redq_speed[True-backward] 12.5692ms 11.5996ms 86.2096 Ops/s 84.8463 Ops/s $\color{#35bf28}+1.61\%$
test_redq_speed[reduce-overhead-None] 4.9099ms 4.4090ms 226.8086 Ops/s 215.4939 Ops/s $\textbf{\color{#35bf28}+5.25\%}$
test_redq_speed[reduce-overhead-backward] 12.2947ms 11.8514ms 84.3779 Ops/s 80.5430 Ops/s $\color{#35bf28}+4.76\%$
test_redq_deprec_speed[False-None] 14.5987ms 12.5906ms 79.4243 Ops/s 75.8877 Ops/s $\color{#35bf28}+4.66\%$
test_redq_deprec_speed[False-backward] 19.6002ms 18.1817ms 55.0004 Ops/s 51.2846 Ops/s $\textbf{\color{#35bf28}+7.25\%}$
test_redq_deprec_speed[True-None] 5.1285ms 3.5227ms 283.8693 Ops/s 269.7097 Ops/s $\textbf{\color{#35bf28}+5.25\%}$
test_redq_deprec_speed[True-backward] 9.1217ms 7.9847ms 125.2403 Ops/s 108.1999 Ops/s $\textbf{\color{#35bf28}+15.75\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.0935ms 3.5247ms 283.7096 Ops/s 264.5377 Ops/s $\textbf{\color{#35bf28}+7.25\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.2554ms 8.2915ms 120.6049 Ops/s 126.7764 Ops/s $\color{#d91a1a}-4.87\%$
test_td3_speed[False-None] 8.5072ms 8.0572ms 124.1130 Ops/s 124.1143 Ops/s $-0.00\%$
test_td3_speed[False-backward] 10.7814ms 10.3394ms 96.7175 Ops/s 95.8102 Ops/s $\color{#35bf28}+0.95\%$
test_td3_speed[True-None] 1.8631ms 1.6806ms 595.0324 Ops/s 583.5346 Ops/s $\color{#35bf28}+1.97\%$
test_td3_speed[True-backward] 3.3284ms 3.2717ms 305.6497 Ops/s 300.5191 Ops/s $\color{#35bf28}+1.71\%$
test_td3_speed[reduce-overhead-None] 1.7695ms 1.6782ms 595.8924 Ops/s 574.1802 Ops/s $\color{#35bf28}+3.78\%$
test_td3_speed[reduce-overhead-backward] 3.3957ms 3.2866ms 304.2681 Ops/s 298.0261 Ops/s $\color{#35bf28}+2.09\%$
test_cql_speed[False-None] 39.1220ms 36.1523ms 27.6608 Ops/s 26.7776 Ops/s $\color{#35bf28}+3.30\%$
test_cql_speed[False-backward] 52.6784ms 47.2623ms 21.1585 Ops/s 21.3532 Ops/s $\color{#d91a1a}-0.91\%$
test_cql_speed[True-None] 16.5328ms 15.3386ms 65.1948 Ops/s 64.3001 Ops/s $\color{#35bf28}+1.39\%$
test_cql_speed[True-backward] 22.8920ms 22.0283ms 45.3961 Ops/s 45.2096 Ops/s $\color{#35bf28}+0.41\%$
test_cql_speed[reduce-overhead-None] 17.5923ms 15.6258ms 63.9967 Ops/s 65.0407 Ops/s $\color{#d91a1a}-1.61\%$
test_cql_speed[reduce-overhead-backward] 23.1881ms 22.3938ms 44.6552 Ops/s 45.2354 Ops/s $\color{#d91a1a}-1.28\%$
test_a2c_speed[False-None] 7.9562ms 7.1839ms 139.1994 Ops/s 138.6444 Ops/s $\color{#35bf28}+0.40\%$
test_a2c_speed[False-backward] 15.1794ms 14.2915ms 69.9719 Ops/s 70.5141 Ops/s $\color{#d91a1a}-0.77\%$
test_a2c_speed[True-None] 4.8773ms 4.1773ms 239.3900 Ops/s 239.1149 Ops/s $\color{#35bf28}+0.12\%$
test_a2c_speed[True-backward] 10.9271ms 10.5524ms 94.7648 Ops/s 93.9433 Ops/s $\color{#35bf28}+0.87\%$
test_a2c_speed[reduce-overhead-None] 4.8377ms 4.1647ms 240.1124 Ops/s 237.6295 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[reduce-overhead-backward] 10.9164ms 10.5238ms 95.0224 Ops/s 93.8649 Ops/s $\color{#35bf28}+1.23\%$
test_ppo_speed[False-None] 8.4461ms 7.3177ms 136.6556 Ops/s 134.0541 Ops/s $\color{#35bf28}+1.94\%$
test_ppo_speed[False-backward] 15.6406ms 14.4601ms 69.1559 Ops/s 66.2598 Ops/s $\color{#35bf28}+4.37\%$
test_ppo_speed[True-None] 4.0474ms 3.6525ms 273.7871 Ops/s 268.5758 Ops/s $\color{#35bf28}+1.94\%$
test_ppo_speed[True-backward] 10.5257ms 9.8063ms 101.9747 Ops/s 103.8360 Ops/s $\color{#d91a1a}-1.79\%$
test_ppo_speed[reduce-overhead-None] 4.0273ms 3.6568ms 273.4665 Ops/s 270.5824 Ops/s $\color{#35bf28}+1.07\%$
test_ppo_speed[reduce-overhead-backward] 10.0981ms 9.5363ms 104.8629 Ops/s 104.2253 Ops/s $\color{#35bf28}+0.61\%$
test_reinforce_speed[False-None] 7.8455ms 6.4369ms 155.3553 Ops/s 152.7097 Ops/s $\color{#35bf28}+1.73\%$
test_reinforce_speed[False-backward] 9.8557ms 9.6375ms 103.7608 Ops/s 102.8359 Ops/s $\color{#35bf28}+0.90\%$
test_reinforce_speed[True-None] 3.2204ms 2.6144ms 382.5007 Ops/s 378.6358 Ops/s $\color{#35bf28}+1.02\%$
test_reinforce_speed[True-backward] 8.8367ms 8.5338ms 117.1814 Ops/s 116.5614 Ops/s $\color{#35bf28}+0.53\%$
test_reinforce_speed[reduce-overhead-None] 3.1617ms 2.6188ms 381.8591 Ops/s 376.9338 Ops/s $\color{#35bf28}+1.31\%$
test_reinforce_speed[reduce-overhead-backward] 8.8728ms 8.4705ms 118.0563 Ops/s 116.0365 Ops/s $\color{#35bf28}+1.74\%$
test_iql_speed[False-None] 39.0456ms 32.3586ms 30.9037 Ops/s 31.3454 Ops/s $\color{#d91a1a}-1.41\%$
test_iql_speed[False-backward] 46.5111ms 44.6851ms 22.3788 Ops/s 22.3752 Ops/s $\color{#35bf28}+0.02\%$
test_iql_speed[True-None] 11.9462ms 10.4531ms 95.6656 Ops/s 90.7367 Ops/s $\textbf{\color{#35bf28}+5.43\%}$
test_iql_speed[True-backward] 21.9634ms 21.1338ms 47.3177 Ops/s 46.2674 Ops/s $\color{#35bf28}+2.27\%$
test_iql_speed[reduce-overhead-None] 11.0972ms 10.4367ms 95.8153 Ops/s 94.8855 Ops/s $\color{#35bf28}+0.98\%$
test_iql_speed[reduce-overhead-backward] 26.4063ms 21.6475ms 46.1946 Ops/s 46.8226 Ops/s $\color{#d91a1a}-1.34\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9924ms 4.7879ms 208.8620 Ops/s 203.3949 Ops/s $\color{#35bf28}+2.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0004ms 0.4981ms 2.0075 KOps/s 1.9516 KOps/s $\color{#35bf28}+2.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8676ms 0.4769ms 2.0968 KOps/s 2.0754 KOps/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4404ms 4.5681ms 218.9117 Ops/s 212.9318 Ops/s $\color{#35bf28}+2.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.4305ms 0.4891ms 2.0446 KOps/s 2.0161 KOps/s $\color{#35bf28}+1.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6684ms 0.4628ms 2.1607 KOps/s 2.1543 KOps/s $\color{#35bf28}+0.30\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3920ms 1.6141ms 619.5538 Ops/s 610.8801 Ops/s $\color{#35bf28}+1.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2347ms 1.5649ms 639.0157 Ops/s 631.0837 Ops/s $\color{#35bf28}+1.26\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.6305ms 4.7081ms 212.3979 Ops/s 207.6757 Ops/s $\color{#35bf28}+2.27\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.9505ms 0.6341ms 1.5771 KOps/s 1.5644 KOps/s $\color{#35bf28}+0.81\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0318ms 0.6069ms 1.6478 KOps/s 1.6257 KOps/s $\color{#35bf28}+1.36\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0532ms 4.6753ms 213.8892 Ops/s 214.1012 Ops/s $\color{#d91a1a}-0.10\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8396ms 0.5082ms 1.9678 KOps/s 1.9449 KOps/s $\color{#35bf28}+1.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7072ms 0.4797ms 2.0847 KOps/s 2.0721 KOps/s $\color{#35bf28}+0.61\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0208ms 4.5148ms 221.4951 Ops/s 214.7045 Ops/s $\color{#35bf28}+3.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1718ms 0.4948ms 2.0212 KOps/s 1.9850 KOps/s $\color{#35bf28}+1.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6506ms 0.4638ms 2.1560 KOps/s 2.1481 KOps/s $\color{#35bf28}+0.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.4540ms 4.7201ms 211.8588 Ops/s 210.0701 Ops/s $\color{#35bf28}+0.85\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2644ms 0.6356ms 1.5732 KOps/s 488.7276 Ops/s $\textbf{\color{#35bf28}+221.90\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8084ms 0.6083ms 1.6439 KOps/s 1.5952 KOps/s $\color{#35bf28}+3.05\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5127ms 4.1804ms 239.2119 Ops/s 232.9428 Ops/s $\color{#35bf28}+2.69\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.3818ms 2.1972ms 455.1279 Ops/s 439.0250 Ops/s $\color{#35bf28}+3.67\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.1029ms 1.3725ms 728.5917 Ops/s 767.8641 Ops/s $\textbf{\color{#d91a1a}-5.11\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3773s 11.6997ms 85.4720 Ops/s 237.2419 Ops/s $\textbf{\color{#d91a1a}-63.97\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.9289ms 2.2991ms 434.9549 Ops/s 418.0883 Ops/s $\color{#35bf28}+4.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.7258ms 1.2863ms 777.4535 Ops/s 792.2578 Ops/s $\color{#d91a1a}-1.87\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.5615ms 4.4291ms 225.7771 Ops/s 236.3487 Ops/s $\color{#d91a1a}-4.47\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.5915ms 2.4305ms 411.4363 Ops/s 413.5989 Ops/s $\color{#d91a1a}-0.52\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.2589ms 1.4015ms 713.5354 Ops/s 710.8524 Ops/s $\color{#35bf28}+0.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.8771ms 11.2377ms 88.9864 Ops/s 81.3736 Ops/s $\textbf{\color{#35bf28}+9.36\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.8660ms 14.8437ms 67.3688 Ops/s 67.0317 Ops/s $\color{#35bf28}+0.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.5672ms 19.8703ms 50.3263 Ops/s 48.4522 Ops/s $\color{#35bf28}+3.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.2398ms 15.1393ms 66.0534 Ops/s 66.2552 Ops/s $\color{#d91a1a}-0.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.3038ms 19.9016ms 50.2472 Ops/s 48.5177 Ops/s $\color{#35bf28}+3.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.1487ms 16.5098ms 60.5700 Ops/s 59.0065 Ops/s $\color{#35bf28}+2.65\%$
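
For anyone cross-checking the table above: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. Below is a minimal sketch of that arithmetic, assuming only the `Ops/s` and `KOps/s` units seen in the table. The helper names `parse_ops` and `percent_change` are hypothetical and are not part of the benchmark tooling.

```python
def parse_ops(text: str) -> float:
    """Convert a cell such as '1.5732 KOps/s' or '488.7276 Ops/s' to Ops/s."""
    value, unit = text.split()
    # Only the two units that occur in the table are handled here (assumption).
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(value) * scale


def percent_change(ops: str, ops_on_head: str) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent."""
    new, base = parse_ops(ops), parse_ops(ops_on_head)
    return (new - base) / base * 100.0


# Values copied from two rows of the CPU table.
print(f"{percent_change('2.3522 Ops/s', '2.2006 Ops/s'):+.2f}%")    # +6.89%
print(f"{percent_change('1.5732 KOps/s', '488.7276 Ops/s'):+.2f}%")  # +221.90%
```

The two calls use the `test_simple` row and the `test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]` row. The second one shows why the units need normalizing before comparing: the PR value is reported in KOps/s while the HEAD value is in Ops/s.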


github-actions bot commented Dec 14, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}13$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7353s 0.7335s 1.3633 Ops/s 1.3256 Ops/s $\color{#35bf28}+2.85\%$
test_transformed 0.9861s 0.9810s 1.0194 Ops/s 1.0215 Ops/s $\color{#d91a1a}-0.21\%$
test_serial 2.1518s 2.1179s 0.4722 Ops/s 0.4687 Ops/s $\color{#35bf28}+0.74\%$
test_parallel 1.9664s 1.9346s 0.5169 Ops/s 0.5018 Ops/s $\color{#35bf28}+3.00\%$
test_step_mdp_speed[True-True-True-True-True] 0.2265ms 38.0063μs 26.3114 KOps/s 26.3969 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-True-True-False] 60.8230μs 22.5471μs 44.3516 KOps/s 45.7603 KOps/s $\color{#d91a1a}-3.08\%$
test_step_mdp_speed[True-True-True-False-True] 49.5430μs 20.9651μs 47.6982 KOps/s 47.9618 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-True-True-False-False] 38.3920μs 12.4428μs 80.3676 KOps/s 80.8391 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-False-True-True] 71.4140μs 41.7655μs 23.9432 KOps/s 24.3162 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[True-True-False-True-False] 53.0530μs 24.7536μs 40.3982 KOps/s 41.7600 KOps/s $\color{#d91a1a}-3.26\%$
test_step_mdp_speed[True-True-False-False-True] 51.9430μs 23.6854μs 42.2202 KOps/s 43.5705 KOps/s $\color{#d91a1a}-3.10\%$
test_step_mdp_speed[True-True-False-False-False] 47.1730μs 14.5649μs 68.6581 KOps/s 69.1311 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[True-False-True-True-True] 71.3140μs 43.3706μs 23.0571 KOps/s 23.2053 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-True-True-False] 90.5550μs 25.8305μs 38.7139 KOps/s 37.9993 KOps/s $\color{#35bf28}+1.88\%$
test_step_mdp_speed[True-False-True-False-True] 50.4630μs 23.5746μs 42.4186 KOps/s 42.7971 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-False-True-False-False] 44.4420μs 14.4630μs 69.1420 KOps/s 69.2413 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-False-False-True-True] 84.8750μs 45.1807μs 22.1334 KOps/s 22.4039 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-False-False-True-False] 95.4560μs 28.3330μs 35.2946 KOps/s 34.9352 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[True-False-False-False-True] 56.7930μs 25.5368μs 39.1592 KOps/s 39.6671 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[True-False-False-False-False] 46.1530μs 16.6817μs 59.9458 KOps/s 60.4872 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[False-True-True-True-True] 79.1250μs 42.9193μs 23.2995 KOps/s 23.0502 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-True-True-True-False] 64.6640μs 26.7299μs 37.4112 KOps/s 37.9486 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[False-True-True-False-True] 53.6030μs 26.9156μs 37.1532 KOps/s 37.5667 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-True-True-False-False] 53.7140μs 16.0804μs 62.1875 KOps/s 62.2936 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-True-False-True-True] 76.9640μs 45.5044μs 21.9759 KOps/s 22.2377 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-False-True-False] 64.4730μs 28.6675μs 34.8827 KOps/s 35.1660 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-True-False-False-True] 3.2636ms 29.0334μs 34.4431 KOps/s 35.4440 KOps/s $\color{#d91a1a}-2.82\%$
test_step_mdp_speed[False-True-False-False-False] 78.7750μs 18.0831μs 55.3001 KOps/s 55.4150 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[False-False-True-True-True] 0.1234ms 47.9770μs 20.8433 KOps/s 21.4669 KOps/s $\color{#d91a1a}-2.90\%$
test_step_mdp_speed[False-False-True-True-False] 0.1472ms 30.2602μs 33.0467 KOps/s 32.8307 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[False-False-True-False-True] 53.0230μs 28.5803μs 34.9891 KOps/s 34.4007 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-True-False-False] 58.4530μs 18.1336μs 55.1462 KOps/s 55.5628 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-False-False-True-True] 86.2750μs 48.3810μs 20.6693 KOps/s 20.6111 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-False-False-True-False] 64.9230μs 32.7638μs 30.5214 KOps/s 30.4353 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-False-False-False-True] 65.3330μs 30.4897μs 32.7980 KOps/s 33.0701 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[False-False-False-False-False] 53.2530μs 20.0396μs 49.9012 KOps/s 49.6180 KOps/s $\color{#35bf28}+0.57\%$
test_values[generalized_advantage_estimate-True-True] 24.9257ms 24.3087ms 41.1375 Ops/s 41.5054 Ops/s $\color{#d91a1a}-0.89\%$
test_values[vec_generalized_advantage_estimate-True-True] 98.6990ms 2.8653ms 348.9983 Ops/s 327.5636 Ops/s $\textbf{\color{#35bf28}+6.54\%}$
test_values[td0_return_estimate-False-False] 0.1049ms 79.2542μs 12.6176 KOps/s 12.6840 KOps/s $\color{#d91a1a}-0.52\%$
test_values[td1_return_estimate-False-False] 56.8070ms 54.5326ms 18.3376 Ops/s 18.7059 Ops/s $\color{#d91a1a}-1.97\%$
test_values[vec_td1_return_estimate-False-False] 1.3872ms 1.0722ms 932.6780 Ops/s 935.3888 Ops/s $\color{#d91a1a}-0.29\%$
test_values[td_lambda_return_estimate-True-False] 89.8106ms 86.1022ms 11.6141 Ops/s 11.7996 Ops/s $\color{#d91a1a}-1.57\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3673ms 1.0714ms 933.3681 Ops/s 933.6269 Ops/s $\color{#d91a1a}-0.03\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.2283ms 24.0466ms 41.5860 Ops/s 42.3120 Ops/s $\color{#d91a1a}-1.72\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0263ms 0.7433ms 1.3453 KOps/s 1.3631 KOps/s $\color{#d91a1a}-1.31\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7458ms 0.6588ms 1.5179 KOps/s 1.5265 KOps/s $\color{#d91a1a}-0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5104ms 1.4695ms 680.4823 Ops/s 684.0006 Ops/s $\color{#d91a1a}-0.51\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7844ms 0.6741ms 1.4834 KOps/s 1.4933 KOps/s $\color{#d91a1a}-0.66\%$
test_dqn_speed[False-None] 7.0689ms 1.4847ms 673.5592 Ops/s 683.6280 Ops/s $\color{#d91a1a}-1.47\%$
test_dqn_speed[False-backward] 2.1347ms 2.0937ms 477.6249 Ops/s 480.5626 Ops/s $\color{#d91a1a}-0.61\%$
test_dqn_speed[True-None] 0.6112ms 0.5145ms 1.9435 KOps/s 1.9002 KOps/s $\color{#35bf28}+2.28\%$
test_dqn_speed[True-backward] 1.2135ms 1.1750ms 851.0398 Ops/s 915.6046 Ops/s $\textbf{\color{#d91a1a}-7.05\%}$
test_dqn_speed[reduce-overhead-None] 0.6486ms 0.5575ms 1.7937 KOps/s 1.8566 KOps/s $\color{#d91a1a}-3.39\%$
test_dqn_speed[reduce-overhead-backward] 1.1006ms 1.0484ms 953.8369 Ops/s 1.0561 KOps/s $\textbf{\color{#d91a1a}-9.69\%}$
test_ddpg_speed[False-None] 3.2291ms 2.8887ms 346.1755 Ops/s 359.1201 Ops/s $\color{#d91a1a}-3.60\%$
test_ddpg_speed[False-backward] 4.5969ms 4.1663ms 240.0200 Ops/s 249.5196 Ops/s $\color{#d91a1a}-3.81\%$
test_ddpg_speed[True-None] 1.1080ms 1.0351ms 966.0576 Ops/s 957.4714 Ops/s $\color{#35bf28}+0.90\%$
test_ddpg_speed[True-backward] 2.2905ms 2.2454ms 445.3497 Ops/s 442.2659 Ops/s $\color{#35bf28}+0.70\%$
test_ddpg_speed[reduce-overhead-None] 1.1281ms 1.0468ms 955.2608 Ops/s 943.4811 Ops/s $\color{#35bf28}+1.25\%$
test_ddpg_speed[reduce-overhead-backward] 1.7882ms 1.7269ms 579.0756 Ops/s 570.1267 Ops/s $\color{#35bf28}+1.57\%$
test_sac_speed[False-None] 8.3114ms 7.9264ms 126.1608 Ops/s 126.6756 Ops/s $\color{#d91a1a}-0.41\%$
test_sac_speed[False-backward] 11.5711ms 11.1384ms 89.7791 Ops/s 89.3180 Ops/s $\color{#35bf28}+0.52\%$
test_sac_speed[True-None] 1.7137ms 1.5610ms 640.6046 Ops/s 668.6644 Ops/s $\color{#d91a1a}-4.20\%$
test_sac_speed[True-backward] 3.4378ms 3.3577ms 297.8199 Ops/s 301.0422 Ops/s $\color{#d91a1a}-1.07\%$
test_sac_speed[reduce-overhead-None] 22.5791ms 12.5188ms 79.8799 Ops/s 81.1183 Ops/s $\color{#d91a1a}-1.53\%$
test_sac_speed[reduce-overhead-backward] 1.6094ms 1.4907ms 670.8472 Ops/s 662.7695 Ops/s $\color{#35bf28}+1.22\%$
test_redq_speed[False-None] 8.0989ms 7.3597ms 135.8758 Ops/s 134.5564 Ops/s $\color{#35bf28}+0.98\%$
test_redq_speed[False-backward] 12.8476ms 11.6599ms 85.7642 Ops/s 85.6898 Ops/s $\color{#35bf28}+0.09\%$
test_redq_speed[True-None] 1.9933ms 1.9421ms 514.9009 Ops/s 508.0090 Ops/s $\color{#35bf28}+1.36\%$
test_redq_speed[True-backward] 4.2030ms 3.8006ms 263.1165 Ops/s 276.8296 Ops/s $\color{#d91a1a}-4.95\%$
test_redq_speed[reduce-overhead-None] 2.0241ms 1.9436ms 514.5066 Ops/s 505.1045 Ops/s $\color{#35bf28}+1.86\%$
test_redq_speed[reduce-overhead-backward] 4.2565ms 3.7928ms 263.6581 Ops/s 277.2018 Ops/s $\color{#d91a1a}-4.89\%$
test_redq_deprec_speed[False-None] 9.4690ms 8.8870ms 112.5245 Ops/s 111.4934 Ops/s $\color{#35bf28}+0.92\%$
test_redq_deprec_speed[False-backward] 12.6042ms 12.1939ms 82.0084 Ops/s 82.6917 Ops/s $\color{#d91a1a}-0.83\%$
test_redq_deprec_speed[True-None] 2.4674ms 2.3574ms 424.1935 Ops/s 429.8752 Ops/s $\color{#d91a1a}-1.32\%$
test_redq_deprec_speed[True-backward] 4.1256ms 3.9254ms 254.7502 Ops/s 249.3675 Ops/s $\color{#35bf28}+2.16\%$
test_redq_deprec_speed[reduce-overhead-None] 2.3524ms 2.2756ms 439.4471 Ops/s 441.0421 Ops/s $\color{#d91a1a}-0.36\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.0026ms 3.9095ms 255.7859 Ops/s 255.3170 Ops/s $\color{#35bf28}+0.18\%$
test_td3_speed[False-None] 8.0759ms 7.8377ms 127.5881 Ops/s 128.2841 Ops/s $\color{#d91a1a}-0.54\%$
test_td3_speed[False-backward] 10.7075ms 10.1841ms 98.1928 Ops/s 98.4241 Ops/s $\color{#d91a1a}-0.24\%$
test_td3_speed[True-None] 1.5568ms 1.5294ms 653.8409 Ops/s 656.2679 Ops/s $\color{#d91a1a}-0.37\%$
test_td3_speed[True-backward] 3.2458ms 3.0743ms 325.2734 Ops/s 324.9282 Ops/s $\color{#35bf28}+0.11\%$
test_td3_speed[reduce-overhead-None] 81.7218ms 25.6649ms 38.9637 Ops/s 38.1164 Ops/s $\color{#35bf28}+2.22\%$
test_td3_speed[reduce-overhead-backward] 1.3760ms 1.2890ms 775.8098 Ops/s 778.4076 Ops/s $\color{#d91a1a}-0.33\%$
test_cql_speed[False-None] 17.0633ms 16.4123ms 60.9298 Ops/s 60.0920 Ops/s $\color{#35bf28}+1.39\%$
test_cql_speed[False-backward] 22.2412ms 21.6143ms 46.2656 Ops/s 45.4050 Ops/s $\color{#35bf28}+1.90\%$
test_cql_speed[True-None] 3.1413ms 2.8675ms 348.7350 Ops/s 326.5470 Ops/s $\textbf{\color{#35bf28}+6.79\%}$
test_cql_speed[True-backward] 5.0257ms 4.9662ms 201.3598 Ops/s 198.3760 Ops/s $\color{#35bf28}+1.50\%$
test_cql_speed[reduce-overhead-None] 21.4601ms 13.1326ms 76.1465 Ops/s 77.2377 Ops/s $\color{#d91a1a}-1.41\%$
test_cql_speed[reduce-overhead-backward] 1.5404ms 1.4859ms 673.0154 Ops/s 593.1254 Ops/s $\textbf{\color{#35bf28}+13.47\%}$
test_a2c_speed[False-None] 3.3918ms 3.2081ms 311.7114 Ops/s 316.3394 Ops/s $\color{#d91a1a}-1.46\%$
test_a2c_speed[False-backward] 6.7466ms 6.1056ms 163.7844 Ops/s 156.5542 Ops/s $\color{#35bf28}+4.62\%$
test_a2c_speed[True-None] 1.2158ms 1.0229ms 977.6130 Ops/s 990.1853 Ops/s $\color{#d91a1a}-1.27\%$
test_a2c_speed[True-backward] 2.5822ms 2.5323ms 394.8963 Ops/s 366.5322 Ops/s $\textbf{\color{#35bf28}+7.74\%}$
test_a2c_speed[reduce-overhead-None] 21.8304ms 11.5974ms 86.2261 Ops/s 86.7661 Ops/s $\color{#d91a1a}-0.62\%$
test_a2c_speed[reduce-overhead-backward] 1.0198ms 0.9601ms 1.0416 KOps/s 885.7739 Ops/s $\textbf{\color{#35bf28}+17.59\%}$
test_ppo_speed[False-None] 3.7232ms 3.6385ms 274.8352 Ops/s 275.9827 Ops/s $\color{#d91a1a}-0.42\%$
test_ppo_speed[False-backward] 7.2683ms 6.8176ms 146.6783 Ops/s 142.6293 Ops/s $\color{#35bf28}+2.84\%$
test_ppo_speed[True-None] 1.0465ms 0.9549ms 1.0473 KOps/s 1.0369 KOps/s $\color{#35bf28}+1.00\%$
test_ppo_speed[True-backward] 2.5312ms 2.4871ms 402.0679 Ops/s 374.9458 Ops/s $\textbf{\color{#35bf28}+7.23\%}$
test_ppo_speed[reduce-overhead-None] 0.5698ms 0.4951ms 2.0200 KOps/s 1.9282 KOps/s $\color{#35bf28}+4.76\%$
test_ppo_speed[reduce-overhead-backward] 1.0730ms 1.0378ms 963.5712 Ops/s 886.2980 Ops/s $\textbf{\color{#35bf28}+8.72\%}$
test_reinforce_speed[False-None] 2.6036ms 2.2403ms 446.3787 Ops/s 449.4928 Ops/s $\color{#d91a1a}-0.69\%$
test_reinforce_speed[False-backward] 3.4589ms 3.3894ms 295.0353 Ops/s 300.9422 Ops/s $\color{#d91a1a}-1.96\%$
test_reinforce_speed[True-None] 1.1922ms 0.8041ms 1.2436 KOps/s 1.2262 KOps/s $\color{#35bf28}+1.42\%$
test_reinforce_speed[True-backward] 2.5793ms 2.4958ms 400.6786 Ops/s 394.0887 Ops/s $\color{#35bf28}+1.67\%$
test_reinforce_speed[reduce-overhead-None] 22.1014ms 11.7082ms 85.4105 Ops/s 88.4694 Ops/s $\color{#d91a1a}-3.46\%$
test_reinforce_speed[reduce-overhead-backward] 1.0608ms 1.0215ms 978.9175 Ops/s 839.6005 Ops/s $\textbf{\color{#35bf28}+16.59\%}$
test_iql_speed[False-None] 9.6317ms 9.1551ms 109.2287 Ops/s 110.8339 Ops/s $\color{#d91a1a}-1.45\%$
test_iql_speed[False-backward] 13.3934ms 12.8583ms 77.7710 Ops/s 76.6441 Ops/s $\color{#35bf28}+1.47\%$
test_iql_speed[True-None] 1.7813ms 1.7074ms 585.6972 Ops/s 570.8759 Ops/s $\color{#35bf28}+2.60\%$
test_iql_speed[True-backward] 4.3279ms 4.1638ms 240.1648 Ops/s 230.6175 Ops/s $\color{#35bf28}+4.14\%$
test_iql_speed[reduce-overhead-None] 20.1830ms 11.5071ms 86.9032 Ops/s 88.7017 Ops/s $\color{#d91a1a}-2.03\%$
test_iql_speed[reduce-overhead-backward] 1.6366ms 1.5853ms 630.7765 Ops/s 706.5655 Ops/s $\textbf{\color{#d91a1a}-10.73\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6346ms 6.2365ms 160.3455 Ops/s 159.0825 Ops/s $\color{#35bf28}+0.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6420ms 0.3372ms 2.9655 KOps/s 2.8318 KOps/s $\color{#35bf28}+4.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6914ms 0.2953ms 3.3863 KOps/s 3.6413 KOps/s $\textbf{\color{#d91a1a}-7.00\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2036ms 5.9750ms 167.3640 Ops/s 164.6980 Ops/s $\color{#35bf28}+1.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7532ms 0.3132ms 3.1927 KOps/s 2.7621 KOps/s $\textbf{\color{#35bf28}+15.59\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4787ms 0.2694ms 3.7113 KOps/s 3.0853 KOps/s $\textbf{\color{#35bf28}+20.29\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6301ms 1.2522ms 798.6140 Ops/s 699.3075 Ops/s $\textbf{\color{#35bf28}+14.20\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4571ms 1.2052ms 829.7517 Ops/s 727.4419 Ops/s $\textbf{\color{#35bf28}+14.06\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2378ms 6.1174ms 163.4687 Ops/s 161.9832 Ops/s $\color{#35bf28}+0.92\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8721ms 0.4117ms 2.4288 KOps/s 1.8572 KOps/s $\textbf{\color{#35bf28}+30.77\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6158ms 0.4311ms 2.3199 KOps/s 1.9606 KOps/s $\textbf{\color{#35bf28}+18.33\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0424ms 5.9575ms 167.8568 Ops/s 165.7552 Ops/s $\color{#35bf28}+1.27\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9156ms 0.3711ms 2.6949 KOps/s 3.6064 KOps/s $\textbf{\color{#d91a1a}-25.28\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6590ms 0.3099ms 3.2272 KOps/s 3.8806 KOps/s $\textbf{\color{#d91a1a}-16.84\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2410ms 5.9743ms 167.3845 Ops/s 167.7796 Ops/s $\color{#d91a1a}-0.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7240ms 0.3606ms 2.7729 KOps/s 3.1992 KOps/s $\textbf{\color{#d91a1a}-13.32\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5046ms 0.3268ms 3.0603 KOps/s 3.0357 KOps/s $\color{#35bf28}+0.81\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2748ms 6.1373ms 162.9369 Ops/s 163.2043 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0557ms 0.4981ms 2.0078 KOps/s 2.2662 KOps/s $\textbf{\color{#d91a1a}-11.40\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7363ms 0.4938ms 2.0250 KOps/s 2.5279 KOps/s $\textbf{\color{#d91a1a}-19.89\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.8750ms 5.2104ms 191.9253 Ops/s 192.7217 Ops/s $\color{#d91a1a}-0.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.2661ms 2.0069ms 498.2770 Ops/s 442.9163 Ops/s $\textbf{\color{#35bf28}+12.50\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.1886ms 1.2644ms 790.8852 Ops/s 871.3771 Ops/s $\textbf{\color{#d91a1a}-9.24\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.6975ms 5.2738ms 189.6171 Ops/s 194.7789 Ops/s $\color{#d91a1a}-2.65\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.5577ms 2.0125ms 496.8896 Ops/s 433.7965 Ops/s $\textbf{\color{#35bf28}+14.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 9.0197ms 1.2839ms 778.8777 Ops/s 856.9863 Ops/s $\textbf{\color{#d91a1a}-9.11\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5176s 15.7228ms 63.6018 Ops/s 33.0457 Ops/s $\textbf{\color{#35bf28}+92.47\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.7576ms 2.2036ms 453.8101 Ops/s 533.6195 Ops/s $\textbf{\color{#d91a1a}-14.96\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2688ms 1.3645ms 732.8863 Ops/s 808.1632 Ops/s $\textbf{\color{#d91a1a}-9.31\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6787ms 13.2254ms 75.6120 Ops/s 75.5542 Ops/s $\color{#35bf28}+0.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.2624ms 17.5747ms 56.9000 Ops/s 57.1786 Ops/s $\color{#d91a1a}-0.49\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.9189ms 17.5307ms 57.0428 Ops/s 55.6989 Ops/s $\color{#35bf28}+2.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.8187ms 17.3724ms 57.5625 Ops/s 58.1852 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.6987ms 17.2508ms 57.9682 Ops/s 55.5976 Ops/s $\color{#35bf28}+4.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.4441ms 18.6151ms 53.7198 Ops/s 53.6462 Ops/s $\color{#35bf28}+0.14\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 186d958ca0653d25783dd80b73897fb3e10c78d4
Pull Request resolved: #2649
@vmoens vmoens added the enhancement New feature or request label Dec 14, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: c319d4ae38bc10f4821ad91a31b53e311a2df5d1
Pull Request resolved: #2649
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 6aaf691ffc19f0a415d21adbdf8b077f8e04cb22
Pull Request resolved: #2649
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: d9d056e75edd8ef8cf9497f95b91e04400bec348
Pull Request resolved: #2649
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: 8e0b064341e5de5d317a09c0eccba140b6a0e078
Pull Request resolved: #2649
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 29e638710dcb9bf2d84196e23df7954be2210053
Pull Request resolved: #2649
[ghstack-poisoned]
@vmoens vmoens merged commit b73eea2 into gh/vmoens/52/base Dec 16, 2024
70 of 78 checks passed
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 77bca166701d28dd69ef3964f55ab4f3e4b17fed
Pull Request resolved: #2649
@vmoens vmoens deleted the gh/vmoens/52/head branch December 16, 2024 01:14
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
2 participants