[Feature] Log pbar rate in SOTA implementations #2662

Merged: 9 commits, Dec 18, 2024

vmoens (Contributor) commented Dec 17, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 03106e831ef158d13ab87be6230c2d512f2401b6
Pull Request resolved: #2662

pytorch-bot bot commented Dec 17, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2662

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (8 Unrelated Failures)

As of commit ace76f3 with merge base 91064bc:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Dec 17, 2024

github-actions bot commented Dec 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}4$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4324s 0.4293s 2.3293 Ops/s 2.1620 Ops/s $\textbf{\color{#35bf28}+7.74\%}$
test_transformed 0.6113s 0.6101s 1.6391 Ops/s 1.6227 Ops/s $\color{#35bf28}+1.01\%$
test_serial 1.3650s 1.3633s 0.7335 Ops/s 0.7327 Ops/s $\color{#35bf28}+0.11\%$
test_parallel 1.4046s 1.3249s 0.7548 Ops/s 0.7465 Ops/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[True-True-True-True-True] 0.2347ms 31.0528μs 32.2032 KOps/s 32.2985 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-True-True-True-False] 49.4520μs 18.0546μs 55.3875 KOps/s 55.4220 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-True-True-False-True] 48.0490μs 17.5756μs 56.8971 KOps/s 58.0843 KOps/s $\color{#d91a1a}-2.04\%$
test_step_mdp_speed[True-True-True-False-False] 33.8730μs 10.1864μs 98.1698 KOps/s 99.1324 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-True-False-True-True] 82.7240μs 32.7477μs 30.5365 KOps/s 30.1920 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-True-False-True-False] 62.0460μs 20.0678μs 49.8310 KOps/s 49.9339 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-True-False-False-True] 52.6380μs 19.5912μs 51.0434 KOps/s 51.1138 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-True-False-False-False] 36.4070μs 12.2189μs 81.8401 KOps/s 82.7315 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-False-True-True-True] 65.2610μs 35.1478μs 28.4513 KOps/s 28.3043 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[True-False-True-True-False] 68.5250μs 22.0597μs 45.3314 KOps/s 44.6035 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-True-False-True] 51.9470μs 19.7329μs 50.6768 KOps/s 51.2762 KOps/s $\color{#d91a1a}-1.17\%$
test_step_mdp_speed[True-False-True-False-False] 46.9070μs 12.2966μs 81.3234 KOps/s 83.0176 KOps/s $\color{#d91a1a}-2.04\%$
test_step_mdp_speed[True-False-False-True-True] 73.0370μs 36.6666μs 27.2728 KOps/s 26.6960 KOps/s $\color{#35bf28}+2.16\%$
test_step_mdp_speed[True-False-False-True-False] 68.5070μs 23.7021μs 42.1903 KOps/s 41.0257 KOps/s $\color{#35bf28}+2.84\%$
test_step_mdp_speed[True-False-False-False-True] 54.5120μs 21.3683μs 46.7983 KOps/s 46.6873 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[True-False-False-False-False] 51.5760μs 14.0343μs 71.2541 KOps/s 70.8365 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-True-True-True-True] 0.1063ms 34.6090μs 28.8942 KOps/s 28.3880 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-True-True-True-False] 49.6230μs 21.9288μs 45.6022 KOps/s 45.0602 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[False-True-True-False-True] 53.7700μs 22.4351μs 44.5730 KOps/s 45.1046 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-True-False-False] 60.8130μs 13.6345μs 73.3436 KOps/s 73.9438 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-True-False-True-True] 82.9750μs 36.9430μs 27.0687 KOps/s 26.6314 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[False-True-False-True-False] 67.7660μs 23.7785μs 42.0549 KOps/s 41.3006 KOps/s $\color{#35bf28}+1.83\%$
test_step_mdp_speed[False-True-False-False-True] 2.8066ms 24.4350μs 40.9250 KOps/s 41.2482 KOps/s $\color{#d91a1a}-0.78\%$
test_step_mdp_speed[False-True-False-False-False] 70.6320μs 15.4394μs 64.7693 KOps/s 64.3134 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-False-True-True-True] 82.8350μs 39.5520μs 25.2832 KOps/s 25.5099 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[False-False-True-True-False] 72.0540μs 26.2474μs 38.0990 KOps/s 38.3377 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-False-True-False-True] 73.0660μs 24.0799μs 41.5285 KOps/s 41.6964 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[False-False-True-False-False] 44.7340μs 15.4817μs 64.5924 KOps/s 64.0709 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-False-False-True-True] 85.7490μs 40.5188μs 24.6799 KOps/s 24.6106 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-False-False-True-False] 64.1100μs 27.4128μs 36.4793 KOps/s 35.9790 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-False-False-False-True] 61.9250μs 25.3156μs 39.5013 KOps/s 38.8087 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-False-False-False-False] 47.3680μs 17.1571μs 58.2849 KOps/s 58.1062 KOps/s $\color{#35bf28}+0.31\%$
test_values[generalized_advantage_estimate-True-True] 12.4421ms 9.7643ms 102.4143 Ops/s 105.1945 Ops/s $\color{#d91a1a}-2.64\%$
test_values[vec_generalized_advantage_estimate-True-True] 59.9401ms 36.9166ms 27.0881 Ops/s 27.7812 Ops/s $\color{#d91a1a}-2.49\%$
test_values[td0_return_estimate-False-False] 0.2495ms 0.1791ms 5.5842 KOps/s 5.1586 KOps/s $\textbf{\color{#35bf28}+8.25\%}$
test_values[td1_return_estimate-False-False] 25.8362ms 24.4701ms 40.8662 Ops/s 40.6322 Ops/s $\color{#35bf28}+0.58\%$
test_values[vec_td1_return_estimate-False-False] 38.5561ms 36.3748ms 27.4916 Ops/s 27.5383 Ops/s $\color{#d91a1a}-0.17\%$
test_values[td_lambda_return_estimate-True-False] 54.4341ms 35.6805ms 28.0265 Ops/s 28.3167 Ops/s $\color{#d91a1a}-1.02\%$
test_values[vec_td_lambda_return_estimate-True-False] 38.4394ms 36.4887ms 27.4057 Ops/s 27.2436 Ops/s $\color{#35bf28}+0.60\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.1948ms 8.4503ms 118.3390 Ops/s 118.6479 Ops/s $\color{#d91a1a}-0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2949ms 1.9582ms 510.6687 Ops/s 429.6548 Ops/s $\textbf{\color{#35bf28}+18.86\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4364ms 0.3624ms 2.7593 KOps/s 2.7765 KOps/s $\color{#d91a1a}-0.62\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 50.1072ms 47.5279ms 21.0403 Ops/s 21.9118 Ops/s $\color{#d91a1a}-3.98\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9236ms 3.0545ms 327.3835 Ops/s 325.7128 Ops/s $\color{#35bf28}+0.51\%$
test_dqn_speed[False-None] 5.7875ms 1.3891ms 719.8840 Ops/s 707.0442 Ops/s $\color{#35bf28}+1.82\%$
test_dqn_speed[False-backward] 1.9573ms 1.8877ms 529.7398 Ops/s 522.8156 Ops/s $\color{#35bf28}+1.32\%$
test_dqn_speed[True-None] 1.2233ms 0.4979ms 2.0085 KOps/s 2.0542 KOps/s $\color{#d91a1a}-2.22\%$
test_dqn_speed[True-backward] 0.9649ms 0.9027ms 1.1078 KOps/s 994.8735 Ops/s $\textbf{\color{#35bf28}+11.35\%}$
test_dqn_speed[reduce-overhead-None] 0.7322ms 0.4866ms 2.0551 KOps/s 2.0606 KOps/s $\color{#d91a1a}-0.27\%$
test_dqn_speed[reduce-overhead-backward] 1.0356ms 0.9116ms 1.0969 KOps/s 1.0834 KOps/s $\color{#35bf28}+1.25\%$
test_ddpg_speed[False-None] 3.9186ms 2.9177ms 342.7366 Ops/s 342.9341 Ops/s $\color{#d91a1a}-0.06\%$
test_ddpg_speed[False-backward] 4.5120ms 4.0666ms 245.9075 Ops/s 240.2096 Ops/s $\color{#35bf28}+2.37\%$
test_ddpg_speed[True-None] 1.2180ms 1.0232ms 977.3379 Ops/s 968.3740 Ops/s $\color{#35bf28}+0.93\%$
test_ddpg_speed[True-backward] 1.9893ms 1.9193ms 521.0343 Ops/s 516.7560 Ops/s $\color{#35bf28}+0.83\%$
test_ddpg_speed[reduce-overhead-None] 1.1890ms 1.0201ms 980.2742 Ops/s 974.3672 Ops/s $\color{#35bf28}+0.61\%$
test_ddpg_speed[reduce-overhead-backward] 2.0102ms 1.9310ms 517.8568 Ops/s 502.8947 Ops/s $\color{#35bf28}+2.98\%$
test_sac_speed[False-None] 10.4453ms 8.1941ms 122.0394 Ops/s 113.7761 Ops/s $\textbf{\color{#35bf28}+7.26\%}$
test_sac_speed[False-backward] 12.6215ms 10.9652ms 91.1977 Ops/s 83.1179 Ops/s $\textbf{\color{#35bf28}+9.72\%}$
test_sac_speed[True-None] 3.6816ms 1.8621ms 537.0240 Ops/s 519.6137 Ops/s $\color{#35bf28}+3.35\%$
test_sac_speed[True-backward] 4.3619ms 3.8504ms 259.7111 Ops/s 256.9949 Ops/s $\color{#35bf28}+1.06\%$
test_sac_speed[reduce-overhead-None] 3.2418ms 1.8875ms 529.8033 Ops/s 525.2958 Ops/s $\color{#35bf28}+0.86\%$
test_sac_speed[reduce-overhead-backward] 3.9504ms 3.5945ms 278.2013 Ops/s 276.2103 Ops/s $\color{#35bf28}+0.72\%$
test_redq_speed[False-None] 28.8282ms 13.9430ms 71.7204 Ops/s 75.7244 Ops/s $\textbf{\color{#d91a1a}-5.29\%}$
test_redq_speed[False-backward] 23.6476ms 22.5301ms 44.3850 Ops/s 43.8676 Ops/s $\color{#35bf28}+1.18\%$
test_redq_speed[True-None] 6.0123ms 4.7275ms 211.5288 Ops/s 201.5603 Ops/s $\color{#35bf28}+4.95\%$
test_redq_speed[True-backward] 13.3260ms 13.0024ms 76.9088 Ops/s 76.7778 Ops/s $\color{#35bf28}+0.17\%$
test_redq_speed[reduce-overhead-None] 5.4931ms 4.7351ms 211.1867 Ops/s 201.9657 Ops/s $\color{#35bf28}+4.57\%$
test_redq_speed[reduce-overhead-backward] 13.5954ms 12.4934ms 80.0425 Ops/s 78.8419 Ops/s $\color{#35bf28}+1.52\%$
test_redq_deprec_speed[False-None] 15.7899ms 13.8477ms 72.2143 Ops/s 73.3966 Ops/s $\color{#d91a1a}-1.61\%$
test_redq_deprec_speed[False-backward] 21.5719ms 19.5791ms 51.0750 Ops/s 48.8970 Ops/s $\color{#35bf28}+4.45\%$
test_redq_deprec_speed[True-None] 5.4536ms 4.1102ms 243.2995 Ops/s 247.9099 Ops/s $\color{#d91a1a}-1.86\%$
test_redq_deprec_speed[True-backward] 12.5848ms 8.2783ms 120.7976 Ops/s 112.1758 Ops/s $\textbf{\color{#35bf28}+7.69\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.8742ms 3.6176ms 276.4235 Ops/s 244.1530 Ops/s $\textbf{\color{#35bf28}+13.22\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.1529ms 8.5211ms 117.3551 Ops/s 107.8494 Ops/s $\textbf{\color{#35bf28}+8.81\%}$
test_td3_speed[False-None] 9.2748ms 8.1714ms 122.3777 Ops/s 113.0590 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_td3_speed[False-backward] 11.9209ms 10.5250ms 95.0121 Ops/s 91.6785 Ops/s $\color{#35bf28}+3.64\%$
test_td3_speed[True-None] 1.9654ms 1.7284ms 578.5804 Ops/s 555.8605 Ops/s $\color{#35bf28}+4.09\%$
test_td3_speed[True-backward] 4.1309ms 3.6581ms 273.3624 Ops/s 289.4313 Ops/s $\textbf{\color{#d91a1a}-5.55\%}$
test_td3_speed[reduce-overhead-None] 1.9573ms 1.7218ms 580.7869 Ops/s 568.5314 Ops/s $\color{#35bf28}+2.16\%$
test_td3_speed[reduce-overhead-backward] 3.4385ms 3.3470ms 298.7789 Ops/s 284.5991 Ops/s $\color{#35bf28}+4.98\%$
test_cql_speed[False-None] 39.3123ms 36.5190ms 27.3830 Ops/s 26.2724 Ops/s $\color{#35bf28}+4.23\%$
test_cql_speed[False-backward] 50.7013ms 47.6883ms 20.9695 Ops/s 20.3801 Ops/s $\color{#35bf28}+2.89\%$
test_cql_speed[True-None] 17.0582ms 15.9495ms 62.6979 Ops/s 61.1930 Ops/s $\color{#35bf28}+2.46\%$
test_cql_speed[True-backward] 23.9432ms 22.7887ms 43.8814 Ops/s 42.4367 Ops/s $\color{#35bf28}+3.40\%$
test_cql_speed[reduce-overhead-None] 17.2383ms 16.0639ms 62.2515 Ops/s 62.4915 Ops/s $\color{#d91a1a}-0.38\%$
test_cql_speed[reduce-overhead-backward] 23.8088ms 22.6968ms 44.0590 Ops/s 41.9436 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_a2c_speed[False-None] 8.4206ms 7.2700ms 137.5508 Ops/s 132.0575 Ops/s $\color{#35bf28}+4.16\%$
test_a2c_speed[False-backward] 15.5521ms 14.8103ms 67.5208 Ops/s 66.3193 Ops/s $\color{#35bf28}+1.81\%$
test_a2c_speed[True-None] 4.9490ms 4.2712ms 234.1256 Ops/s 227.2961 Ops/s $\color{#35bf28}+3.00\%$
test_a2c_speed[True-backward] 11.4321ms 11.0255ms 90.6989 Ops/s 85.6782 Ops/s $\textbf{\color{#35bf28}+5.86\%}$
test_a2c_speed[reduce-overhead-None] 5.0150ms 4.4159ms 226.4528 Ops/s 228.1733 Ops/s $\color{#d91a1a}-0.75\%$
test_a2c_speed[reduce-overhead-backward] 12.5016ms 11.0297ms 90.6646 Ops/s 91.1014 Ops/s $\color{#d91a1a}-0.48\%$
test_ppo_speed[False-None] 7.8943ms 7.5873ms 131.7991 Ops/s 131.2376 Ops/s $\color{#35bf28}+0.43\%$
test_ppo_speed[False-backward] 15.8510ms 15.3349ms 65.2107 Ops/s 64.4997 Ops/s $\color{#35bf28}+1.10\%$
test_ppo_speed[True-None] 4.9232ms 3.8832ms 257.5168 Ops/s 256.4833 Ops/s $\color{#35bf28}+0.40\%$
test_ppo_speed[True-backward] 10.7983ms 9.7980ms 102.0614 Ops/s 97.4842 Ops/s $\color{#35bf28}+4.70\%$
test_ppo_speed[reduce-overhead-None] 4.4211ms 3.7894ms 263.8915 Ops/s 264.9795 Ops/s $\color{#d91a1a}-0.41\%$
test_ppo_speed[reduce-overhead-backward] 10.3186ms 9.9756ms 100.2449 Ops/s 100.9933 Ops/s $\color{#d91a1a}-0.74\%$
test_reinforce_speed[False-None] 8.1687ms 6.7567ms 148.0005 Ops/s 147.1020 Ops/s $\color{#35bf28}+0.61\%$
test_reinforce_speed[False-backward] 11.5096ms 10.4146ms 96.0191 Ops/s 98.2374 Ops/s $\color{#d91a1a}-2.26\%$
test_reinforce_speed[True-None] 3.0274ms 2.7744ms 360.4411 Ops/s 357.9590 Ops/s $\color{#35bf28}+0.69\%$
test_reinforce_speed[True-backward] 9.3186ms 8.6843ms 115.1503 Ops/s 112.0177 Ops/s $\color{#35bf28}+2.80\%$
test_reinforce_speed[reduce-overhead-None] 3.0521ms 2.6653ms 375.1860 Ops/s 362.2541 Ops/s $\color{#35bf28}+3.57\%$
test_reinforce_speed[reduce-overhead-backward] 9.4471ms 8.6108ms 116.1331 Ops/s 105.1058 Ops/s $\textbf{\color{#35bf28}+10.49\%}$
test_iql_speed[False-None] 35.6603ms 32.7750ms 30.5111 Ops/s 29.1292 Ops/s $\color{#35bf28}+4.74\%$
test_iql_speed[False-backward] 53.3404ms 45.9430ms 21.7661 Ops/s 21.2032 Ops/s $\color{#35bf28}+2.65\%$
test_iql_speed[True-None] 12.5349ms 11.0457ms 90.5327 Ops/s 88.5947 Ops/s $\color{#35bf28}+2.19\%$
test_iql_speed[True-backward] 23.6993ms 22.6112ms 44.2259 Ops/s 43.9365 Ops/s $\color{#35bf28}+0.66\%$
test_iql_speed[reduce-overhead-None] 12.0859ms 10.9379ms 91.4250 Ops/s 90.9463 Ops/s $\color{#35bf28}+0.53\%$
test_iql_speed[reduce-overhead-backward] 23.6261ms 21.9400ms 45.5789 Ops/s 43.3449 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.2160ms 5.1511ms 194.1351 Ops/s 194.1613 Ops/s $\color{#d91a1a}-0.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9410ms 0.5123ms 1.9518 KOps/s 1.9520 KOps/s $-0.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8437ms 0.4881ms 2.0488 KOps/s 2.0473 KOps/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.6570ms 4.9532ms 201.8917 Ops/s 209.2999 Ops/s $\color{#d91a1a}-3.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.9441ms 0.4950ms 2.0200 KOps/s 1.9828 KOps/s $\color{#35bf28}+1.88\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7102ms 0.4642ms 2.1541 KOps/s 2.1137 KOps/s $\color{#35bf28}+1.91\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2067ms 1.6219ms 616.5441 Ops/s 581.3704 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2987ms 1.5911ms 628.5141 Ops/s 626.2949 Ops/s $\color{#35bf28}+0.35\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.7455ms 5.0647ms 197.4456 Ops/s 201.9768 Ops/s $\color{#d91a1a}-2.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3598ms 0.6494ms 1.5400 KOps/s 1.5277 KOps/s $\color{#35bf28}+0.81\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8532ms 0.6125ms 1.6326 KOps/s 1.6144 KOps/s $\color{#35bf28}+1.13\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.1848ms 4.8990ms 204.1221 Ops/s 209.1303 Ops/s $\color{#d91a1a}-2.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5426ms 0.5125ms 1.9513 KOps/s 1.9181 KOps/s $\color{#35bf28}+1.74\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8845ms 0.4830ms 2.0706 KOps/s 2.0635 KOps/s $\color{#35bf28}+0.34\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3297ms 4.8805ms 204.8990 Ops/s 205.2187 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3849ms 0.4902ms 2.0400 KOps/s 1.9306 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7818ms 0.4735ms 2.1117 KOps/s 2.1127 KOps/s $\color{#d91a1a}-0.04\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2621ms 5.0198ms 199.2104 Ops/s 201.5384 Ops/s $\color{#d91a1a}-1.16\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2900ms 0.6496ms 1.5395 KOps/s 1.5193 KOps/s $\color{#35bf28}+1.33\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8327ms 0.6126ms 1.6323 KOps/s 1.5746 KOps/s $\color{#35bf28}+3.66\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.2961ms 4.3773ms 228.4494 Ops/s 38.4854 Ops/s $\textbf{\color{#35bf28}+493.60\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.1331ms 2.3865ms 419.0280 Ops/s 411.1101 Ops/s $\color{#35bf28}+1.93\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.3545ms 1.3727ms 728.5050 Ops/s 758.0527 Ops/s $\color{#d91a1a}-3.90\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4409s 13.1646ms 75.9610 Ops/s 225.9524 Ops/s $\textbf{\color{#d91a1a}-66.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.5734ms 2.4453ms 408.9465 Ops/s 401.1125 Ops/s $\color{#35bf28}+1.95\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.5564ms 1.2622ms 792.2848 Ops/s 768.7779 Ops/s $\color{#35bf28}+3.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.1304ms 4.4615ms 224.1421 Ops/s 218.7304 Ops/s $\color{#35bf28}+2.47\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.3360ms 2.4609ms 406.3510 Ops/s 406.6435 Ops/s $\color{#d91a1a}-0.07\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.8499ms 1.5199ms 657.9320 Ops/s 668.9654 Ops/s $\color{#d91a1a}-1.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 15.7029ms 13.5876ms 73.5967 Ops/s 71.3966 Ops/s $\color{#35bf28}+3.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.4879ms 15.2788ms 65.4503 Ops/s 65.2823 Ops/s $\color{#35bf28}+0.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 24.2598ms 22.0752ms 45.2998 Ops/s 44.4556 Ops/s $\color{#35bf28}+1.90\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.1445ms 15.6145ms 64.0432 Ops/s 65.4533 Ops/s $\color{#d91a1a}-2.15\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 28.5987ms 22.5367ms 44.3721 Ops/s 45.2803 Ops/s $\color{#d91a1a}-2.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 30.7122ms 17.5481ms 56.9861 Ops/s 60.2146 Ops/s $\textbf{\color{#d91a1a}-5.36\%}$
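
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that reading. The function names are hypothetical. The 5% cut-off is an assumption inferred from which rows are shown in bold and from the Improved/Worsened totals; the bot comment does not state it.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not taken from the bot."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from the CPU table above, both in Ops/s.
rows = {
    "test_simple": (2.3293, 2.1620),
    "test_serial": (0.7335, 0.7327),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (75.9610, 225.9524),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

When the two columns use different units (KOps/s versus Ops/s, as in `test_dqn_speed[True-backward]`), convert both to the same unit before calling `percent_change`.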


github-actions bot commented Dec 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}14$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7115s 0.7109s 1.4067 Ops/s 1.3345 Ops/s $\textbf{\color{#35bf28}+5.40\%}$
test_transformed 0.9727s 0.9695s 1.0314 Ops/s 1.0459 Ops/s $\color{#d91a1a}-1.38\%$
test_serial 2.2079s 2.1258s 0.4704 Ops/s 0.4788 Ops/s $\color{#d91a1a}-1.75\%$
test_parallel 2.0834s 1.9882s 0.5030 Ops/s 0.5134 Ops/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[True-True-True-True-True] 0.1898ms 40.7942μs 24.5133 KOps/s 24.9116 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[True-True-True-True-False] 56.6810μs 23.5959μs 42.3803 KOps/s 42.4471 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-True-True-False-True] 70.0810μs 22.8410μs 43.7809 KOps/s 44.7864 KOps/s $\color{#d91a1a}-2.24\%$
test_step_mdp_speed[True-True-True-False-False] 39.9100μs 13.1163μs 76.2412 KOps/s 77.4340 KOps/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[True-True-False-True-True] 84.9610μs 44.0291μs 22.7123 KOps/s 23.5580 KOps/s $\color{#d91a1a}-3.59\%$
test_step_mdp_speed[True-True-False-True-False] 73.0510μs 26.1327μs 38.2663 KOps/s 39.6545 KOps/s $\color{#d91a1a}-3.50\%$
test_step_mdp_speed[True-True-False-False-True] 64.1210μs 25.2437μs 39.6139 KOps/s 40.6706 KOps/s $\color{#d91a1a}-2.60\%$
test_step_mdp_speed[True-True-False-False-False] 43.4510μs 15.5028μs 64.5046 KOps/s 65.0130 KOps/s $\color{#d91a1a}-0.78\%$
test_step_mdp_speed[True-False-True-True-True] 81.6520μs 45.9989μs 21.7396 KOps/s 22.3901 KOps/s $\color{#d91a1a}-2.91\%$
test_step_mdp_speed[True-False-True-True-False] 64.2310μs 28.5534μs 35.0220 KOps/s 35.6140 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[True-False-True-False-True] 56.6910μs 25.4985μs 39.2180 KOps/s 41.0910 KOps/s $\color{#d91a1a}-4.56\%$
test_step_mdp_speed[True-False-True-False-False] 42.1310μs 15.5628μs 64.2557 KOps/s 64.9201 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[True-False-False-True-True] 78.2820μs 48.3306μs 20.6908 KOps/s 21.4938 KOps/s $\color{#d91a1a}-3.74\%$
test_step_mdp_speed[True-False-False-True-False] 63.1610μs 31.0024μs 32.2556 KOps/s 33.2450 KOps/s $\color{#d91a1a}-2.98\%$
test_step_mdp_speed[True-False-False-False-True] 51.7910μs 27.0077μs 37.0264 KOps/s 37.1182 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-False-False-False] 47.3910μs 18.0173μs 55.5021 KOps/s 56.9844 KOps/s $\color{#d91a1a}-2.60\%$
test_step_mdp_speed[False-True-True-True-True] 94.4610μs 46.1237μs 21.6808 KOps/s 22.2468 KOps/s $\color{#d91a1a}-2.54\%$
test_step_mdp_speed[False-True-True-True-False] 60.2010μs 28.6970μs 34.8469 KOps/s 35.5783 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[False-True-True-False-True] 67.1810μs 28.8315μs 34.6842 KOps/s 34.9767 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[False-True-True-False-False] 48.7510μs 17.0909μs 58.5106 KOps/s 58.3319 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[False-True-False-True-True] 85.3320μs 48.8500μs 20.4708 KOps/s 21.3168 KOps/s $\color{#d91a1a}-3.97\%$
test_step_mdp_speed[False-True-False-True-False] 63.8220μs 30.8056μs 32.4616 KOps/s 32.6319 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-True-False-False-True] 2.9914ms 31.7772μs 31.4691 KOps/s 31.8444 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-False-False-False] 64.7210μs 19.7154μs 50.7218 KOps/s 51.6013 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[False-False-True-True-True] 0.1093ms 50.3442μs 19.8633 KOps/s 20.2635 KOps/s $\color{#d91a1a}-1.98\%$
test_step_mdp_speed[False-False-True-True-False] 58.2810μs 33.2896μs 30.0394 KOps/s 30.1812 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[False-False-True-False-True] 69.1310μs 30.9496μs 32.3106 KOps/s 32.4104 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[False-False-True-False-False] 44.7610μs 19.6769μs 50.8209 KOps/s 51.4734 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[False-False-False-True-True] 93.9810μs 52.8455μs 18.9231 KOps/s 19.4053 KOps/s $\color{#d91a1a}-2.48\%$
test_step_mdp_speed[False-False-False-True-False] 70.8010μs 35.5552μs 28.1253 KOps/s 28.6714 KOps/s $\color{#d91a1a}-1.90\%$
test_step_mdp_speed[False-False-False-False-True] 62.0010μs 32.8056μs 30.4826 KOps/s 30.9521 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[False-False-False-False-False] 52.9310μs 21.7195μs 46.0416 KOps/s 46.5566 KOps/s $\color{#d91a1a}-1.11\%$
test_values[generalized_advantage_estimate-True-True] 25.4166ms 24.9798ms 40.0323 Ops/s 41.2412 Ops/s $\color{#d91a1a}-2.93\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1055s 3.0033ms 332.9676 Ops/s 348.8355 Ops/s $\color{#d91a1a}-4.55\%$
test_values[td0_return_estimate-False-False] 0.1043ms 80.2373μs 12.4630 KOps/s 12.6388 KOps/s $\color{#d91a1a}-1.39\%$
test_values[td1_return_estimate-False-False] 55.6559ms 55.3302ms 18.0733 Ops/s 17.5882 Ops/s $\color{#35bf28}+2.76\%$
test_values[vec_td1_return_estimate-False-False] 1.2763ms 1.0740ms 931.1418 Ops/s 933.6755 Ops/s $\color{#d91a1a}-0.27\%$
test_values[td_lambda_return_estimate-True-False] 88.1215ms 87.6787ms 11.4053 Ops/s 11.1120 Ops/s $\color{#35bf28}+2.64\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2317ms 1.0700ms 934.5719 Ops/s 945.2749 Ops/s $\color{#d91a1a}-1.13\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.7402ms 24.5189ms 40.7848 Ops/s 39.1375 Ops/s $\color{#35bf28}+4.21\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0981ms 0.7601ms 1.3156 KOps/s 1.3282 KOps/s $\color{#d91a1a}-0.95\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7592ms 0.6649ms 1.5040 KOps/s 1.5005 KOps/s $\color{#35bf28}+0.24\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5522ms 1.4827ms 674.4493 Ops/s 678.2278 Ops/s $\color{#d91a1a}-0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7200ms 0.6798ms 1.4709 KOps/s 1.4305 KOps/s $\color{#35bf28}+2.83\%$
test_dqn_speed[False-None] 6.7282ms 1.5345ms 651.6797 Ops/s 665.8051 Ops/s $\color{#d91a1a}-2.12\%$
test_dqn_speed[False-backward] 2.1695ms 2.1235ms 470.9221 Ops/s 482.2305 Ops/s $\color{#d91a1a}-2.35\%$
test_dqn_speed[True-None] 0.5967ms 0.5363ms 1.8647 KOps/s 1.8745 KOps/s $\color{#d91a1a}-0.52\%$
test_dqn_speed[True-backward] 1.5112ms 1.1043ms 905.5117 Ops/s 915.2953 Ops/s $\color{#d91a1a}-1.07\%$
test_dqn_speed[reduce-overhead-None] 0.6968ms 0.5622ms 1.7788 KOps/s 1.8223 KOps/s $\color{#d91a1a}-2.39\%$
test_dqn_speed[reduce-overhead-backward] 1.0103ms 0.9630ms 1.0384 KOps/s 1.0338 KOps/s $\color{#35bf28}+0.45\%$
test_ddpg_speed[False-None] 3.1942ms 2.8619ms 349.4173 Ops/s 355.1729 Ops/s $\color{#d91a1a}-1.62\%$
test_ddpg_speed[False-backward] 4.5461ms 4.0927ms 244.3368 Ops/s 247.4201 Ops/s $\color{#d91a1a}-1.25\%$
test_ddpg_speed[True-None] 1.1766ms 1.0885ms 918.6955 Ops/s 920.8043 Ops/s $\color{#d91a1a}-0.23\%$
test_ddpg_speed[True-backward] 2.2575ms 2.1761ms 459.5278 Ops/s 436.5004 Ops/s $\textbf{\color{#35bf28}+5.28\%}$
test_ddpg_speed[reduce-overhead-None] 1.1710ms 1.1020ms 907.4767 Ops/s 888.8506 Ops/s $\color{#35bf28}+2.10\%$
test_ddpg_speed[reduce-overhead-backward] 1.7436ms 1.6490ms 606.4163 Ops/s 561.2429 Ops/s $\textbf{\color{#35bf28}+8.05\%}$
test_sac_speed[False-None] 8.5193ms 8.0997ms 123.4606 Ops/s 126.0720 Ops/s $\color{#d91a1a}-2.07\%$
test_sac_speed[False-backward] 11.7098ms 11.0537ms 90.4676 Ops/s 90.3450 Ops/s $\color{#35bf28}+0.14\%$
test_sac_speed[True-None] 1.6282ms 1.5618ms 640.2918 Ops/s 647.1313 Ops/s $\color{#d91a1a}-1.06\%$
test_sac_speed[True-backward] 3.5223ms 3.4764ms 287.6509 Ops/s 312.0400 Ops/s $\textbf{\color{#d91a1a}-7.82\%}$
test_sac_speed[reduce-overhead-None] 23.5696ms 12.6965ms 78.7620 Ops/s 79.7919 Ops/s $\color{#d91a1a}-1.29\%$
test_sac_speed[reduce-overhead-backward] 1.5786ms 1.4992ms 667.0434 Ops/s 745.9025 Ops/s $\textbf{\color{#d91a1a}-10.57\%}$
test_redq_speed[False-None] 8.2807ms 7.5396ms 132.6329 Ops/s 133.7850 Ops/s $\color{#d91a1a}-0.86\%$
test_redq_speed[False-backward] 12.5220ms 11.6102ms 86.1308 Ops/s 89.8234 Ops/s $\color{#d91a1a}-4.11\%$
test_redq_speed[True-None] 2.0668ms 2.0192ms 495.2361 Ops/s 503.3010 Ops/s $\color{#d91a1a}-1.60\%$
test_redq_speed[True-backward] 4.2106ms 3.7070ms 269.7594 Ops/s 271.9748 Ops/s $\color{#d91a1a}-0.81\%$
test_redq_speed[reduce-overhead-None] 2.1044ms 2.0198ms 495.1035 Ops/s 493.1738 Ops/s $\color{#35bf28}+0.39\%$
test_redq_speed[reduce-overhead-backward] 3.9377ms 3.8921ms 256.9319 Ops/s 261.0607 Ops/s $\color{#d91a1a}-1.58\%$
test_redq_deprec_speed[False-None] 9.9605ms 9.1142ms 109.7191 Ops/s 110.4345 Ops/s $\color{#d91a1a}-0.65\%$
test_redq_deprec_speed[False-backward] 12.6965ms 12.2329ms 81.7467 Ops/s 81.6070 Ops/s $\color{#35bf28}+0.17\%$
test_redq_deprec_speed[True-None] 2.5363ms 2.3975ms 417.0996 Ops/s 433.6442 Ops/s $\color{#d91a1a}-3.82\%$
test_redq_deprec_speed[True-backward] 4.6733ms 4.2436ms 235.6467 Ops/s 251.1922 Ops/s $\textbf{\color{#d91a1a}-6.19\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.4366ms 2.3676ms 422.3648 Ops/s 428.8907 Ops/s $\color{#d91a1a}-1.52\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6676ms 4.2553ms 234.9998 Ops/s 241.0746 Ops/s $\color{#d91a1a}-2.52\%$
test_td3_speed[False-None] 34.6570ms 8.2134ms 121.7527 Ops/s 127.4695 Ops/s $\color{#d91a1a}-4.48\%$
test_td3_speed[False-backward] 10.9089ms 10.4521ms 95.6742 Ops/s 96.8696 Ops/s $\color{#d91a1a}-1.23\%$
test_td3_speed[True-None] 1.6360ms 1.6154ms 619.0367 Ops/s 622.7899 Ops/s $\color{#d91a1a}-0.60\%$
test_td3_speed[True-backward] 3.4170ms 3.3463ms 298.8375 Ops/s 303.5224 Ops/s $\color{#d91a1a}-1.54\%$
test_td3_speed[reduce-overhead-None] 51.6345ms 26.3055ms 38.0149 Ops/s 36.7976 Ops/s $\color{#35bf28}+3.31\%$
test_td3_speed[reduce-overhead-backward] 1.4983ms 1.4381ms 695.3767 Ops/s 770.2880 Ops/s $\textbf{\color{#d91a1a}-9.73\%}$
test_cql_speed[False-None] 17.6175ms 16.8966ms 59.1835 Ops/s 59.6600 Ops/s $\color{#d91a1a}-0.80\%$
test_cql_speed[False-backward] 22.8922ms 22.1882ms 45.0690 Ops/s 45.8696 Ops/s $\color{#d91a1a}-1.75\%$
test_cql_speed[True-None] 3.1618ms 3.0117ms 332.0344 Ops/s 331.2413 Ops/s $\color{#35bf28}+0.24\%$
test_cql_speed[True-backward] 5.7274ms 5.1942ms 192.5212 Ops/s 189.0631 Ops/s $\color{#35bf28}+1.83\%$
test_cql_speed[reduce-overhead-None] 21.7925ms 13.2700ms 75.3582 Ops/s 75.4509 Ops/s $\color{#d91a1a}-0.12\%$
test_cql_speed[reduce-overhead-backward] 1.5690ms 1.5017ms 665.8936 Ops/s 586.7687 Ops/s $\textbf{\color{#35bf28}+13.48\%}$
test_a2c_speed[False-None] 3.4281ms 3.2490ms 307.7895 Ops/s 316.4489 Ops/s $\color{#d91a1a}-2.74\%$
test_a2c_speed[False-backward] 6.6347ms 6.0425ms 165.4945 Ops/s 160.5276 Ops/s $\color{#35bf28}+3.09\%$
test_a2c_speed[True-None] 1.1034ms 1.0226ms 977.8786 Ops/s 995.2927 Ops/s $\color{#d91a1a}-1.75\%$
test_a2c_speed[True-backward] 3.0852ms 2.6492ms 377.4693 Ops/s 361.2962 Ops/s $\color{#35bf28}+4.48\%$
test_a2c_speed[reduce-overhead-None] 20.8477ms 11.4289ms 87.4975 Ops/s 88.3447 Ops/s $\color{#d91a1a}-0.96\%$
test_a2c_speed[reduce-overhead-backward] 0.9950ms 0.9661ms 1.0351 KOps/s 870.9611 Ops/s $\textbf{\color{#35bf28}+18.84\%}$
test_ppo_speed[False-None] 3.7672ms 3.6785ms 271.8491 Ops/s 278.5988 Ops/s $\color{#d91a1a}-2.42\%$
test_ppo_speed[False-backward] 7.1762ms 6.7063ms 149.1140 Ops/s 145.1799 Ops/s $\color{#35bf28}+2.71\%$
test_ppo_speed[True-None] 1.0661ms 0.9781ms 1.0224 KOps/s 1.0592 KOps/s $\color{#d91a1a}-3.48\%$
test_ppo_speed[True-backward] 2.6553ms 2.5575ms 391.0088 Ops/s 370.0515 Ops/s $\textbf{\color{#35bf28}+5.66\%}$
test_ppo_speed[reduce-overhead-None] 0.6142ms 0.5123ms 1.9520 KOps/s 1.8833 KOps/s $\color{#35bf28}+3.65\%$
test_ppo_speed[reduce-overhead-backward] 1.0169ms 0.9511ms 1.0514 KOps/s 1.0127 KOps/s $\color{#35bf28}+3.82\%$
test_reinforce_speed[False-None] 2.4812ms 2.2799ms 438.6085 Ops/s 440.1708 Ops/s $\color{#d91a1a}-0.35\%$
test_reinforce_speed[False-backward] 3.6828ms 3.2445ms 308.2135 Ops/s 307.6971 Ops/s $\color{#35bf28}+0.17\%$
test_reinforce_speed[True-None] 0.9452ms 0.8383ms 1.1929 KOps/s 1.2035 KOps/s $\color{#d91a1a}-0.88\%$
test_reinforce_speed[True-backward] 2.6343ms 2.4490ms 408.3311 Ops/s 412.6146 Ops/s $\color{#d91a1a}-1.04\%$
test_reinforce_speed[reduce-overhead-None] 21.9284ms 11.6817ms 85.6039 Ops/s 88.4183 Ops/s $\color{#d91a1a}-3.18\%$
test_reinforce_speed[reduce-overhead-backward] 1.2204ms 1.1617ms 860.7820 Ops/s 826.5926 Ops/s $\color{#35bf28}+4.14\%$
test_iql_speed[False-None] 9.8358ms 9.3688ms 106.7374 Ops/s 108.5767 Ops/s $\color{#d91a1a}-1.69\%$
test_iql_speed[False-backward] 13.7036ms 13.1474ms 76.0604 Ops/s 76.5349 Ops/s $\color{#d91a1a}-0.62\%$
test_iql_speed[True-None] 1.8538ms 1.7992ms 555.8073 Ops/s 572.0836 Ops/s $\color{#d91a1a}-2.85\%$
test_iql_speed[True-backward] 4.8159ms 4.4763ms 223.4003 Ops/s 227.6028 Ops/s $\color{#d91a1a}-1.85\%$
test_iql_speed[reduce-overhead-None] 19.8806ms 11.5051ms 86.9177 Ops/s 89.7988 Ops/s $\color{#d91a1a}-3.21\%$
test_iql_speed[reduce-overhead-backward] 1.6329ms 1.5911ms 628.4772 Ops/s 703.5188 Ops/s $\textbf{\color{#d91a1a}-10.67\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9078ms 6.3945ms 156.3845 Ops/s 152.7452 Ops/s $\color{#35bf28}+2.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6878ms 0.2833ms 3.5301 KOps/s 2.6355 KOps/s $\textbf{\color{#35bf28}+33.95\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5095ms 0.3288ms 3.0418 KOps/s 2.7773 KOps/s $\textbf{\color{#35bf28}+9.52\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4339ms 6.1468ms 162.6873 Ops/s 158.4880 Ops/s $\color{#35bf28}+2.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1435ms 0.3297ms 3.0329 KOps/s 3.3511 KOps/s $\textbf{\color{#d91a1a}-9.50\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5393ms 0.3337ms 2.9968 KOps/s 4.2258 KOps/s $\textbf{\color{#d91a1a}-29.08\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6307ms 1.3868ms 721.0706 Ops/s 807.5286 Ops/s $\textbf{\color{#d91a1a}-10.71\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5482ms 1.3346ms 749.3124 Ops/s 855.0844 Ops/s $\textbf{\color{#d91a1a}-12.37\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5121ms 6.3833ms 156.6585 Ops/s 155.1293 Ops/s $\color{#35bf28}+0.99\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9069ms 0.5012ms 1.9952 KOps/s 2.3072 KOps/s $\textbf{\color{#d91a1a}-13.52\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7127ms 0.4741ms 2.1093 KOps/s 2.3783 KOps/s $\textbf{\color{#d91a1a}-11.31\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3627ms 6.1676ms 162.1385 Ops/s 159.0060 Ops/s $\color{#35bf28}+1.97\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9749ms 0.3230ms 3.0955 KOps/s 3.2086 KOps/s $\color{#d91a1a}-3.52\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5541ms 0.3116ms 3.2093 KOps/s 3.0736 KOps/s $\color{#35bf28}+4.42\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3780ms 6.1444ms 162.7492 Ops/s 160.4018 Ops/s $\color{#35bf28}+1.46\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5851ms 0.3072ms 3.2556 KOps/s 3.0401 KOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5594ms 0.3210ms 3.1150 KOps/s 3.1554 KOps/s $\color{#d91a1a}-1.28\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4770ms 6.2721ms 159.4369 Ops/s 155.2321 Ops/s $\color{#35bf28}+2.71\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0011ms 0.4081ms 2.4501 KOps/s 1.9563 KOps/s $\textbf{\color{#35bf28}+25.24\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7197ms 0.4777ms 2.0935 KOps/s 2.2555 KOps/s $\textbf{\color{#d91a1a}-7.18\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9714ms 5.3491ms 186.9476 Ops/s 186.6266 Ops/s $\color{#35bf28}+0.17\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.4529ms 1.9618ms 509.7477 Ops/s 444.3611 Ops/s $\textbf{\color{#35bf28}+14.71\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.5973ms 1.2811ms 780.5502 Ops/s 850.5156 Ops/s $\textbf{\color{#d91a1a}-8.23\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.5155ms 5.4550ms 183.3178 Ops/s 187.6058 Ops/s $\color{#d91a1a}-2.29\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.4637ms 2.0173ms 495.7060 Ops/s 437.4060 Ops/s $\textbf{\color{#35bf28}+13.33\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.0863ms 1.2362ms 808.9069 Ops/s 867.8229 Ops/s $\textbf{\color{#d91a1a}-6.79\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5111s 15.7293ms 63.5758 Ops/s 33.5740 Ops/s $\textbf{\color{#35bf28}+89.36\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.1523ms 2.1397ms 467.3648 Ops/s 454.2814 Ops/s $\color{#35bf28}+2.88\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.1576ms 1.3566ms 737.1504 Ops/s 708.3244 Ops/s $\color{#35bf28}+4.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.8255ms 13.5956ms 73.5531 Ops/s 72.2420 Ops/s $\color{#35bf28}+1.81\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.6944ms 17.4097ms 57.4391 Ops/s 58.0073 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2488ms 17.9153ms 55.8181 Ops/s 53.0301 Ops/s $\textbf{\color{#35bf28}+5.26\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6174ms 17.5204ms 57.0763 Ops/s 56.3758 Ops/s $\color{#35bf28}+1.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.8550ms 17.6673ms 56.6017 Ops/s 53.8123 Ops/s $\textbf{\color{#35bf28}+5.18\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.2192ms 18.7666ms 53.2861 Ops/s 51.3599 Ops/s $\color{#35bf28}+3.75\%$
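
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table. The formula is an assumption inferred from the reported numbers, not taken from the benchmark bot's source, and `relative_change` is a hypothetical helper name.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to the repo HEAD baseline.

    Assumed formula (inferred from the table, not from the bot's code):
    (ops_pr - ops_head) / ops_head * 100.
    """
    if ops_head <= 0:
        raise ValueError("baseline throughput must be positive")
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the table above.
rows = {
    "test_td3_speed[reduce-overhead-backward]": (695.3767, 770.2880),
    "test_cql_speed[reduce-overhead-backward]": (665.8936, 586.7687),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (63.5758, 33.5740),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Rows reported in KOps/s are rounded more coarsely, so recomputing from the displayed values can differ from the reported change in the last digit.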

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: cc2e120f70212f6b34ba439a8425cb1d9c49d8f0
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 11dff37f598411133c4d6e61f4c760bd5abf6a08
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 183a57a0e3630b031448c3bc87d0a1cf49ad73bc
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 1576be36aeaff4333580d450f44265ae54e4a34b
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 055d649dcd47df788c28a19e84d1aace7abc08a5
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: 110e906b617f644465ae4ff1360d8b644bf5be6f
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 17, 2024
ghstack-source-id: d7338d4c6223db1079a4cd4d4f2ece398c63ed92
Pull Request resolved: #2662
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 18, 2024
ghstack-source-id: 283cc1bb4ad2d60281296d2cfb78ec41c77f4129
Pull Request resolved: #2662
@vmoens vmoens merged commit ace76f3 into gh/vmoens/58/base Dec 18, 2024
70 of 78 checks passed
vmoens added a commit that referenced this pull request Dec 18, 2024
ghstack-source-id: 283cc1bb4ad2d60281296d2cfb78ec41c77f4129
Pull Request resolved: #2662
@vmoens vmoens deleted the gh/vmoens/58/head branch December 18, 2024 15:31
@vmoens vmoens added the enhancement New feature or request label Dec 18, 2024