[Feature] TD3 compatibility with compile #2658

Merged 1 commit on Dec 16, 2024

@vmoens (Contributor) commented Dec 16, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2658

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 7 Unrelated Failures

As of commit 4626ab3 with merge base 87a59fb:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 16, 2024
@vmoens added the enhancement label Dec 16, 2024
@vmoens merged commit 4626ab3 into gh/vmoens/59/base Dec 16, 2024
62 of 66 checks passed
@vmoens deleted the gh/vmoens/59/head branch December 16, 2024 04:14

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4228s 0.4199s 2.3818 Ops/s 2.2481 Ops/s $\textbf{\color{#35bf28}+5.94\%}$
test_transformed 0.6066s 0.6037s 1.6564 Ops/s 1.6288 Ops/s $\color{#35bf28}+1.69\%$
test_serial 1.3556s 1.3459s 0.7430 Ops/s 0.7320 Ops/s $\color{#35bf28}+1.51\%$
test_parallel 1.2863s 1.2784s 0.7822 Ops/s 0.7592 Ops/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[True-True-True-True-True] 0.1695ms 29.4162μs 33.9949 KOps/s 33.2363 KOps/s $\color{#35bf28}+2.28\%$
test_step_mdp_speed[True-True-True-True-False] 68.8610μs 17.3520μs 57.6304 KOps/s 56.6581 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-True-True-False-True] 41.6080μs 16.7827μs 59.5853 KOps/s 58.7620 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-True-False-False] 33.5630μs 9.9712μs 100.2889 KOps/s 101.3973 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-False-True-True] 70.1610μs 31.8503μs 31.3969 KOps/s 30.6803 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[True-True-False-True-False] 49.5030μs 19.4868μs 51.3167 KOps/s 50.8951 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[True-True-False-False-True] 46.5870μs 18.6988μs 53.4795 KOps/s 52.5595 KOps/s $\color{#35bf28}+1.75\%$
test_step_mdp_speed[True-True-False-False-False] 36.2080μs 12.0846μs 82.7498 KOps/s 85.0142 KOps/s $\color{#d91a1a}-2.66\%$
test_step_mdp_speed[True-False-True-True-True] 72.4550μs 34.1722μs 29.2635 KOps/s 29.3419 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[True-False-True-True-False] 62.9870μs 21.5177μs 46.4734 KOps/s 46.1159 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[True-False-True-False-True] 42.7200μs 18.6748μs 53.5482 KOps/s 52.3114 KOps/s $\color{#35bf28}+2.36\%$
test_step_mdp_speed[True-False-True-False-False] 40.8760μs 11.9389μs 83.7595 KOps/s 83.8643 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-False-False-True-True] 74.4690μs 35.5391μs 28.1380 KOps/s 27.4811 KOps/s $\color{#35bf28}+2.39\%$
test_step_mdp_speed[True-False-False-True-False] 54.5920μs 23.5323μs 42.4948 KOps/s 42.7662 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[True-False-False-False-True] 50.3040μs 20.5830μs 48.5838 KOps/s 48.5300 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[True-False-False-False-False] 36.6990μs 13.6564μs 73.2258 KOps/s 72.6051 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-True-True-True] 0.6265ms 33.3006μs 30.0295 KOps/s 29.0386 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[False-True-True-True-False] 50.8850μs 21.5388μs 46.4279 KOps/s 46.3136 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-True-True-False-True] 59.1900μs 21.0681μs 47.4652 KOps/s 46.3178 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[False-True-True-False-False] 39.8150μs 13.2167μs 75.6618 KOps/s 75.8452 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-True-False-True-True] 0.1024ms 35.3982μs 28.2500 KOps/s 28.1331 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[False-True-False-True-False] 81.9130μs 22.8466μs 43.7703 KOps/s 43.1672 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-True-False-False-True] 2.7458ms 22.7366μs 43.9819 KOps/s 43.5133 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-True-False-False-False] 43.6120μs 14.8351μs 67.4076 KOps/s 67.5929 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[False-False-True-True-True] 72.1550μs 36.6774μs 27.2648 KOps/s 26.9260 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[False-False-True-True-False] 66.0040μs 24.6252μs 40.6088 KOps/s 40.4518 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-False-True-False-True] 72.3750μs 22.3857μs 44.6714 KOps/s 43.3443 KOps/s $\color{#35bf28}+3.06\%$
test_step_mdp_speed[False-False-True-False-False] 50.1440μs 14.9803μs 66.7544 KOps/s 67.6283 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[False-False-False-True-True] 83.4450μs 38.7218μs 25.8252 KOps/s 25.2941 KOps/s $\color{#35bf28}+2.10\%$
test_step_mdp_speed[False-False-False-True-False] 60.1020μs 26.7167μs 37.4298 KOps/s 37.6587 KOps/s $\color{#d91a1a}-0.61\%$
test_step_mdp_speed[False-False-False-False-True] 58.7200μs 24.3909μs 40.9989 KOps/s 40.3612 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-False-False-False-False] 48.4900μs 16.7529μs 59.6911 KOps/s 60.5257 KOps/s $\color{#d91a1a}-1.38\%$
test_values[generalized_advantage_estimate-True-True] 9.6556ms 9.3428ms 107.0346 Ops/s 107.0830 Ops/s $\color{#d91a1a}-0.05\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.9134ms 35.8690ms 27.8792 Ops/s 30.1503 Ops/s $\textbf{\color{#d91a1a}-7.53\%}$
test_values[td0_return_estimate-False-False] 0.2371ms 0.1723ms 5.8026 KOps/s 5.6849 KOps/s $\color{#35bf28}+2.07\%$
test_values[td1_return_estimate-False-False] 27.3221ms 23.7820ms 42.0485 Ops/s 42.4690 Ops/s $\color{#d91a1a}-0.99\%$
test_values[vec_td1_return_estimate-False-False] 37.5342ms 35.9345ms 27.8284 Ops/s 29.6759 Ops/s $\textbf{\color{#d91a1a}-6.23\%}$
test_values[td_lambda_return_estimate-True-False] 35.2101ms 33.9648ms 29.4423 Ops/s 29.0801 Ops/s $\color{#35bf28}+1.25\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.9314ms 35.9538ms 27.8135 Ops/s 30.0844 Ops/s $\textbf{\color{#d91a1a}-7.55\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.3159ms 8.1378ms 122.8835 Ops/s 122.2005 Ops/s $\color{#35bf28}+0.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 3.6212ms 1.9730ms 506.8540 Ops/s 439.7658 Ops/s $\textbf{\color{#35bf28}+15.26\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4993ms 0.3529ms 2.8337 KOps/s 2.7834 KOps/s $\color{#35bf28}+1.81\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 50.1247ms 47.4399ms 21.0793 Ops/s 24.7253 Ops/s $\textbf{\color{#d91a1a}-14.75\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.8849ms 3.0074ms 332.5179 Ops/s 330.3368 Ops/s $\color{#35bf28}+0.66\%$
test_dqn_speed[False-None] 2.0190ms 1.3519ms 739.7138 Ops/s 714.5343 Ops/s $\color{#35bf28}+3.52\%$
test_dqn_speed[False-backward] 2.4240ms 1.8683ms 535.2556 Ops/s 525.6186 Ops/s $\color{#35bf28}+1.83\%$
test_dqn_speed[True-None] 0.7634ms 0.4516ms 2.2143 KOps/s 1.8362 KOps/s $\textbf{\color{#35bf28}+20.59\%}$
test_dqn_speed[True-backward] 0.9425ms 0.8685ms 1.1515 KOps/s 1.1215 KOps/s $\color{#35bf28}+2.67\%$
test_dqn_speed[reduce-overhead-None] 0.7838ms 0.4541ms 2.2021 KOps/s 2.1525 KOps/s $\color{#35bf28}+2.30\%$
test_dqn_speed[reduce-overhead-backward] 1.0235ms 0.8858ms 1.1289 KOps/s 1.0730 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_ddpg_speed[False-None] 3.5227ms 2.8107ms 355.7806 Ops/s 347.7951 Ops/s $\color{#35bf28}+2.30\%$
test_ddpg_speed[False-backward] 4.0773ms 3.9159ms 255.3672 Ops/s 252.9126 Ops/s $\color{#35bf28}+0.97\%$
test_ddpg_speed[True-None] 1.3873ms 0.9795ms 1.0209 KOps/s 1.0101 KOps/s $\color{#35bf28}+1.07\%$
test_ddpg_speed[True-backward] 1.8992ms 1.8513ms 540.1732 Ops/s 522.1883 Ops/s $\color{#35bf28}+3.44\%$
test_ddpg_speed[reduce-overhead-None] 1.3797ms 0.9768ms 1.0237 KOps/s 1.0023 KOps/s $\color{#35bf28}+2.14\%$
test_ddpg_speed[reduce-overhead-backward] 1.8791ms 1.8484ms 540.9954 Ops/s 530.7875 Ops/s $\color{#35bf28}+1.92\%$
test_sac_speed[False-None] 8.7699ms 7.8869ms 126.7931 Ops/s 124.7457 Ops/s $\color{#35bf28}+1.64\%$
test_sac_speed[False-backward] 11.7429ms 10.5948ms 94.3856 Ops/s 92.5703 Ops/s $\color{#35bf28}+1.96\%$
test_sac_speed[True-None] 2.0768ms 1.7936ms 557.5429 Ops/s 547.6235 Ops/s $\color{#35bf28}+1.81\%$
test_sac_speed[True-backward] 3.6444ms 3.5509ms 281.6152 Ops/s 285.3190 Ops/s $\color{#d91a1a}-1.30\%$
test_sac_speed[reduce-overhead-None] 2.3944ms 1.8087ms 552.8764 Ops/s 548.4408 Ops/s $\color{#35bf28}+0.81\%$
test_sac_speed[reduce-overhead-backward] 3.5607ms 3.4699ms 288.1961 Ops/s 284.3356 Ops/s $\color{#35bf28}+1.36\%$
test_redq_speed[False-None] 20.0890ms 14.3345ms 69.7618 Ops/s 77.7860 Ops/s $\textbf{\color{#d91a1a}-10.32\%}$
test_redq_speed[False-backward] 23.3693ms 21.9093ms 45.6427 Ops/s 44.4543 Ops/s $\color{#35bf28}+2.67\%$
test_redq_speed[True-None] 5.4562ms 4.4496ms 224.7385 Ops/s 224.8009 Ops/s $\color{#d91a1a}-0.03\%$
test_redq_speed[True-backward] 12.0980ms 11.8397ms 84.4614 Ops/s 86.6963 Ops/s $\color{#d91a1a}-2.58\%$
test_redq_speed[reduce-overhead-None] 5.5683ms 4.6141ms 216.7249 Ops/s 224.9322 Ops/s $\color{#d91a1a}-3.65\%$
test_redq_speed[reduce-overhead-backward] 13.2255ms 12.1001ms 82.6439 Ops/s 86.0075 Ops/s $\color{#d91a1a}-3.91\%$
test_redq_deprec_speed[False-None] 14.5433ms 12.5162ms 79.8962 Ops/s 78.3837 Ops/s $\color{#35bf28}+1.93\%$
test_redq_deprec_speed[False-backward] 19.6233ms 18.1702ms 55.0351 Ops/s 54.5020 Ops/s $\color{#35bf28}+0.98\%$
test_redq_deprec_speed[True-None] 6.8857ms 3.5911ms 278.4667 Ops/s 278.9303 Ops/s $\color{#d91a1a}-0.17\%$
test_redq_deprec_speed[True-backward] 10.7079ms 8.5183ms 117.3948 Ops/s 125.5098 Ops/s $\textbf{\color{#d91a1a}-6.47\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.2017ms 3.5243ms 283.7476 Ops/s 281.8647 Ops/s $\color{#35bf28}+0.67\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.3750ms 7.9538ms 125.7261 Ops/s 125.7137 Ops/s $+0.01\%$
test_td3_speed[False-None] 9.7322ms 7.9914ms 125.1350 Ops/s 123.8744 Ops/s $\color{#35bf28}+1.02\%$
test_td3_speed[False-backward] 10.8396ms 10.2594ms 97.4717 Ops/s 96.4420 Ops/s $\color{#35bf28}+1.07\%$
test_td3_speed[True-None] 1.7768ms 1.6911ms 591.3331 Ops/s 571.2427 Ops/s $\color{#35bf28}+3.52\%$
test_td3_speed[True-backward] 3.4410ms 3.3137ms 301.7755 Ops/s 298.4554 Ops/s $\color{#35bf28}+1.11\%$
test_td3_speed[reduce-overhead-None] 1.8147ms 1.6878ms 592.4737 Ops/s 579.1564 Ops/s $\color{#35bf28}+2.30\%$
test_td3_speed[reduce-overhead-backward] 3.4449ms 3.3006ms 302.9790 Ops/s 301.7488 Ops/s $\color{#35bf28}+0.41\%$
test_cql_speed[False-None] 39.4263ms 35.9541ms 27.8133 Ops/s 27.4769 Ops/s $\color{#35bf28}+1.22\%$
test_cql_speed[False-backward] 48.3428ms 45.6835ms 21.8898 Ops/s 21.2397 Ops/s $\color{#35bf28}+3.06\%$
test_cql_speed[True-None] 16.4108ms 15.4382ms 64.7744 Ops/s 62.7351 Ops/s $\color{#35bf28}+3.25\%$
test_cql_speed[True-backward] 23.2685ms 22.1606ms 45.1251 Ops/s 45.1800 Ops/s $\color{#d91a1a}-0.12\%$
test_cql_speed[reduce-overhead-None] 16.5735ms 15.4568ms 64.6964 Ops/s 64.7336 Ops/s $\color{#d91a1a}-0.06\%$
test_cql_speed[reduce-overhead-backward] 23.5524ms 21.6762ms 46.1336 Ops/s 46.4191 Ops/s $\color{#d91a1a}-0.62\%$
test_a2c_speed[False-None] 8.1957ms 7.1200ms 140.4492 Ops/s 138.3712 Ops/s $\color{#35bf28}+1.50\%$
test_a2c_speed[False-backward] 14.4271ms 14.1098ms 70.8726 Ops/s 70.3923 Ops/s $\color{#35bf28}+0.68\%$
test_a2c_speed[True-None] 4.6735ms 4.1918ms 238.5582 Ops/s 238.1572 Ops/s $\color{#35bf28}+0.17\%$
test_a2c_speed[True-backward] 11.8771ms 10.6548ms 93.8545 Ops/s 94.2116 Ops/s $\color{#d91a1a}-0.38\%$
test_a2c_speed[reduce-overhead-None] 4.9566ms 4.2739ms 233.9806 Ops/s 236.9968 Ops/s $\color{#d91a1a}-1.27\%$
test_a2c_speed[reduce-overhead-backward] 11.5486ms 10.7732ms 92.8233 Ops/s 93.7884 Ops/s $\color{#d91a1a}-1.03\%$
test_ppo_speed[False-None] 8.0149ms 7.4623ms 134.0074 Ops/s 134.0428 Ops/s $\color{#d91a1a}-0.03\%$
test_ppo_speed[False-backward] 15.5483ms 15.0279ms 66.5428 Ops/s 68.6312 Ops/s $\color{#d91a1a}-3.04\%$
test_ppo_speed[True-None] 4.0800ms 3.7377ms 267.5467 Ops/s 269.8220 Ops/s $\color{#d91a1a}-0.84\%$
test_ppo_speed[True-backward] 11.0111ms 9.7100ms 102.9864 Ops/s 104.7381 Ops/s $\color{#d91a1a}-1.67\%$
test_ppo_speed[reduce-overhead-None] 4.7039ms 3.7206ms 268.7725 Ops/s 272.2102 Ops/s $\color{#d91a1a}-1.26\%$
test_ppo_speed[reduce-overhead-backward] 10.2087ms 9.7090ms 102.9972 Ops/s 104.7150 Ops/s $\color{#d91a1a}-1.64\%$
test_reinforce_speed[False-None] 7.6099ms 6.6114ms 151.2542 Ops/s 152.8486 Ops/s $\color{#d91a1a}-1.04\%$
test_reinforce_speed[False-backward] 11.5689ms 9.7840ms 102.2075 Ops/s 102.4187 Ops/s $\color{#d91a1a}-0.21\%$
test_reinforce_speed[True-None] 2.9402ms 2.6342ms 379.6179 Ops/s 379.4647 Ops/s $\color{#35bf28}+0.04\%$
test_reinforce_speed[True-backward] 8.8790ms 8.4711ms 118.0479 Ops/s 116.4434 Ops/s $\color{#35bf28}+1.38\%$
test_reinforce_speed[reduce-overhead-None] 3.2631ms 2.6376ms 379.1275 Ops/s 373.1986 Ops/s $\color{#35bf28}+1.59\%$
test_reinforce_speed[reduce-overhead-backward] 9.2434ms 8.4882ms 117.8109 Ops/s 115.3302 Ops/s $\color{#35bf28}+2.15\%$
test_iql_speed[False-None] 33.0859ms 31.7265ms 31.5194 Ops/s 30.6387 Ops/s $\color{#35bf28}+2.87\%$
test_iql_speed[False-backward] 46.3671ms 44.4595ms 22.4924 Ops/s 21.5522 Ops/s $\color{#35bf28}+4.36\%$
test_iql_speed[True-None] 13.3340ms 10.5640ms 94.6611 Ops/s 90.3835 Ops/s $\color{#35bf28}+4.73\%$
test_iql_speed[True-backward] 22.1881ms 21.0230ms 47.5671 Ops/s 45.8158 Ops/s $\color{#35bf28}+3.82\%$
test_iql_speed[reduce-overhead-None] 11.5300ms 10.4754ms 95.4620 Ops/s 91.7826 Ops/s $\color{#35bf28}+4.01\%$
test_iql_speed[reduce-overhead-backward] 22.6360ms 21.9492ms 45.5597 Ops/s 46.9016 Ops/s $\color{#d91a1a}-2.86\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.6604ms 5.0787ms 196.9013 Ops/s 198.8943 Ops/s $\color{#d91a1a}-1.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7363ms 0.5046ms 1.9818 KOps/s 1.9518 KOps/s $\color{#35bf28}+1.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8129ms 0.4802ms 2.0826 KOps/s 2.0919 KOps/s $\color{#d91a1a}-0.44\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.5372ms 5.1359ms 194.7061 Ops/s 211.3779 Ops/s $\textbf{\color{#d91a1a}-7.89\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.3714ms 0.5244ms 1.9069 KOps/s 2.0274 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8299ms 0.4787ms 2.0890 KOps/s 2.1314 KOps/s $\color{#d91a1a}-1.99\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9070ms 1.6117ms 620.4778 Ops/s 613.7985 Ops/s $\color{#35bf28}+1.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1430ms 1.5786ms 633.4728 Ops/s 635.8989 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.3493ms 5.0178ms 199.2894 Ops/s 208.3155 Ops/s $\color{#d91a1a}-4.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1931ms 0.6400ms 1.5624 KOps/s 1.5766 KOps/s $\color{#d91a1a}-0.90\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0786ms 0.6091ms 1.6418 KOps/s 1.5592 KOps/s $\textbf{\color{#35bf28}+5.29\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.2429ms 4.6350ms 215.7504 Ops/s 213.3996 Ops/s $\color{#35bf28}+1.10\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4273s 1.0511ms 951.3641 Ops/s 1.9855 KOps/s $\textbf{\color{#d91a1a}-52.08\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7106ms 0.4782ms 2.0914 KOps/s 1.9930 KOps/s $\color{#35bf28}+4.93\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3281ms 4.5973ms 217.5181 Ops/s 203.7328 Ops/s $\textbf{\color{#35bf28}+6.77\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0490ms 0.4927ms 2.0297 KOps/s 1.9699 KOps/s $\color{#35bf28}+3.04\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7745ms 0.4731ms 2.1137 KOps/s 2.1208 KOps/s $\color{#d91a1a}-0.34\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4557ms 4.8834ms 204.7764 Ops/s 203.1810 Ops/s $\color{#35bf28}+0.79\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0339ms 0.6292ms 1.5894 KOps/s 1.5605 KOps/s $\color{#35bf28}+1.85\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.3410ms 0.6337ms 1.5779 KOps/s 1.6522 KOps/s $\color{#d91a1a}-4.50\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.2736ms 4.5214ms 221.1728 Ops/s 253.6053 Ops/s $\textbf{\color{#d91a1a}-12.79\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.1479ms 2.3336ms 428.5281 Ops/s 442.7077 Ops/s $\color{#d91a1a}-3.20\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9272ms 1.2830ms 779.4089 Ops/s 749.4887 Ops/s $\color{#35bf28}+3.99\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3656s 11.4679ms 87.2000 Ops/s 38.2964 Ops/s $\textbf{\color{#35bf28}+127.70\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.3419ms 2.1648ms 461.9371 Ops/s 422.9371 Ops/s $\textbf{\color{#35bf28}+9.22\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.4575ms 1.4020ms 713.2607 Ops/s 740.0434 Ops/s $\color{#d91a1a}-3.62\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.3756ms 4.3326ms 230.8068 Ops/s 225.0504 Ops/s $\color{#35bf28}+2.56\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.7722ms 2.3992ms 416.8086 Ops/s 398.9694 Ops/s $\color{#35bf28}+4.47\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.3622ms 1.4547ms 687.4278 Ops/s 683.9574 Ops/s $\color{#35bf28}+0.51\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.8310ms 11.6168ms 86.0823 Ops/s 86.6136 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.0343ms 15.1305ms 66.0919 Ops/s 66.1744 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.4717ms 20.1308ms 49.6751 Ops/s 49.3885 Ops/s $\color{#35bf28}+0.58\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.4128ms 15.1817ms 65.8687 Ops/s 64.9727 Ops/s $\color{#35bf28}+1.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.6894ms 20.0806ms 49.7992 Ops/s 49.8274 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.6103ms 16.4674ms 60.7259 Ops/s 60.6927 Ops/s $\color{#35bf28}+0.05\%$
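For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the CPU table. The helper names (`relative_change`, `classify`) are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bolded and from the Improved/Worsened counts, not something the benchmark bot states.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the Repo HEAD baseline."""
    return (ops - ops_head) / ops_head * 100.0


# Assumption: rows count as improved/worsened only beyond +/-5%,
# inferred from the bolded entries in the table above.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a row as improved, worsened, or unchanged."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in Ops/s, copied from the CPU table;
# KOps/s values are converted to Ops/s.
rows = {
    "test_td3_speed[True-None]": (591.3331, 571.2427),
    "test_dqn_speed[True-None]": (2214.3, 1836.2),
    "test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]": (951.3641, 1985.5),
}

results = {}
for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    results[name] = (change, classify(change))
    print(f"{name}: {change:+.2f}% ({results[name][1]})")
```

Under these assumptions the three rows come out at roughly +3.52%, +20.59%, and -52.08%, matching the reported values. Because the inputs are already rounded, other rows can differ from the table in the last digit.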

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7168s 0.7155s 1.3977 Ops/s 1.3628 Ops/s $\color{#35bf28}+2.57\%$
test_transformed 0.9709s 0.9698s 1.0311 Ops/s 1.0285 Ops/s $\color{#35bf28}+0.25\%$
test_serial 2.1237s 2.1205s 0.4716 Ops/s 0.4780 Ops/s $\color{#d91a1a}-1.33\%$
test_parallel 1.9954s 1.9078s 0.5242 Ops/s 0.5108 Ops/s $\color{#35bf28}+2.62\%$
test_step_mdp_speed[True-True-True-True-True] 0.1826ms 39.2222μs 25.4958 KOps/s 25.7667 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[True-True-True-True-False] 50.6510μs 22.5471μs 44.3517 KOps/s 44.5896 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-True-False-True] 49.1900μs 21.9003μs 45.6615 KOps/s 47.9527 KOps/s $\color{#d91a1a}-4.78\%$
test_step_mdp_speed[True-True-True-False-False] 35.9800μs 12.5814μs 79.4826 KOps/s 80.2172 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[True-True-False-True-True] 74.8610μs 41.4738μs 24.1116 KOps/s 24.3139 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-True-False-True-False] 53.2300μs 24.8037μs 40.3165 KOps/s 42.4787 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_step_mdp_speed[True-True-False-False-True] 57.1210μs 24.3463μs 41.0740 KOps/s 42.6006 KOps/s $\color{#d91a1a}-3.58\%$
test_step_mdp_speed[True-True-False-False-False] 44.1200μs 14.8801μs 67.2039 KOps/s 68.9983 KOps/s $\color{#d91a1a}-2.60\%$
test_step_mdp_speed[True-False-True-True-True] 79.6910μs 44.1680μs 22.6408 KOps/s 23.1572 KOps/s $\color{#d91a1a}-2.23\%$
test_step_mdp_speed[True-False-True-True-False] 56.2710μs 26.8336μs 37.2668 KOps/s 37.4243 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-False-True-False-True] 51.9810μs 23.5686μs 42.4293 KOps/s 42.0926 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-False-True-False-False] 40.3700μs 14.7190μs 67.9393 KOps/s 68.4134 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-False-False-True-True] 75.3710μs 45.8938μs 21.7894 KOps/s 21.8028 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-False-False-True-False] 55.9610μs 29.1178μs 34.3432 KOps/s 34.7468 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-False-False-False-True] 52.8310μs 25.9705μs 38.5053 KOps/s 38.6997 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[True-False-False-False-False] 53.2110μs 16.9309μs 59.0638 KOps/s 59.1976 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[False-True-True-True-True] 76.3210μs 43.8407μs 22.8099 KOps/s 22.6920 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-True-True-True-False] 63.9000μs 26.9439μs 37.1141 KOps/s 37.3679 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[False-True-True-False-True] 57.6600μs 28.0207μs 35.6879 KOps/s 35.9244 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-True-False-False] 43.2000μs 16.4131μs 60.9268 KOps/s 60.6669 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[False-True-False-True-True] 80.5410μs 45.7186μs 21.8729 KOps/s 21.9943 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[False-True-False-True-False] 60.2010μs 29.3107μs 34.1172 KOps/s 34.7235 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[False-True-False-False-True] 3.2883ms 30.1663μs 33.1496 KOps/s 34.4698 KOps/s $\color{#d91a1a}-3.83\%$
test_step_mdp_speed[False-True-False-False-False] 51.1400μs 18.5947μs 53.7787 KOps/s 55.2517 KOps/s $\color{#d91a1a}-2.67\%$
test_step_mdp_speed[False-False-True-True-True] 75.7610μs 48.6242μs 20.5659 KOps/s 20.9883 KOps/s $\color{#d91a1a}-2.01\%$
test_step_mdp_speed[False-False-True-True-False] 60.0410μs 31.8085μs 31.4381 KOps/s 32.2551 KOps/s $\color{#d91a1a}-2.53\%$
test_step_mdp_speed[False-False-True-False-True] 55.9510μs 30.0045μs 33.3283 KOps/s 33.7611 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[False-False-True-False-False] 46.2200μs 18.4534μs 54.1905 KOps/s 54.0086 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-False-False-True-True] 87.3410μs 50.4170μs 19.8346 KOps/s 20.0794 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-False-False-True-False] 67.6810μs 34.2608μs 29.1879 KOps/s 29.8147 KOps/s $\color{#d91a1a}-2.10\%$
test_step_mdp_speed[False-False-False-False-True] 65.8410μs 31.3359μs 31.9123 KOps/s 31.9759 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[False-False-False-False-False] 49.8610μs 20.4694μs 48.8534 KOps/s 48.1951 KOps/s $\color{#35bf28}+1.37\%$
test_values[generalized_advantage_estimate-True-True] 25.5287ms 25.1213ms 39.8068 Ops/s 41.7483 Ops/s $\color{#d91a1a}-4.65\%$
test_values[vec_generalized_advantage_estimate-True-True] 96.1948ms 2.8219ms 354.3672 Ops/s 335.9041 Ops/s $\textbf{\color{#35bf28}+5.50\%}$
test_values[td0_return_estimate-False-False] 0.1044ms 78.3308μs 12.7664 KOps/s 12.8883 KOps/s $\color{#d91a1a}-0.95\%$
test_values[td1_return_estimate-False-False] 55.9660ms 55.4031ms 18.0495 Ops/s 18.7767 Ops/s $\color{#d91a1a}-3.87\%$
test_values[vec_td1_return_estimate-False-False] 1.3965ms 1.0789ms 926.8726 Ops/s 936.3577 Ops/s $\color{#d91a1a}-1.01\%$
test_values[td_lambda_return_estimate-True-False] 88.1469ms 87.5970ms 11.4159 Ops/s 11.8493 Ops/s $\color{#d91a1a}-3.66\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3761ms 1.0770ms 928.4828 Ops/s 935.3224 Ops/s $\color{#d91a1a}-0.73\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.6966ms 24.5459ms 40.7400 Ops/s 42.2668 Ops/s $\color{#d91a1a}-3.61\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0330ms 0.7504ms 1.3326 KOps/s 1.3576 KOps/s $\color{#d91a1a}-1.84\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7968ms 0.6655ms 1.5027 KOps/s 1.5301 KOps/s $\color{#d91a1a}-1.79\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5375ms 1.4815ms 674.9727 Ops/s 683.1310 Ops/s $\color{#d91a1a}-1.19\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7617ms 0.6958ms 1.4372 KOps/s 1.4949 KOps/s $\color{#d91a1a}-3.86\%$
test_dqn_speed[False-None] 7.0613ms 1.5111ms 661.7620 Ops/s 676.8168 Ops/s $\color{#d91a1a}-2.22\%$
test_dqn_speed[False-backward] 2.2933ms 2.1132ms 473.2177 Ops/s 483.2586 Ops/s $\color{#d91a1a}-2.08\%$
test_dqn_speed[True-None] 0.6563ms 0.5341ms 1.8722 KOps/s 1.8005 KOps/s $\color{#35bf28}+3.98\%$
test_dqn_speed[True-backward] 1.2510ms 1.1910ms 839.6557 Ops/s 827.5086 Ops/s $\color{#35bf28}+1.47\%$
test_dqn_speed[reduce-overhead-None] 0.6202ms 0.5556ms 1.8000 KOps/s 1.7978 KOps/s $\color{#35bf28}+0.12\%$
test_dqn_speed[reduce-overhead-backward] 1.0972ms 1.0604ms 943.0673 Ops/s 933.6002 Ops/s $\color{#35bf28}+1.01\%$
test_ddpg_speed[False-None] 3.1550ms 2.8563ms 350.1001 Ops/s 355.6278 Ops/s $\color{#d91a1a}-1.55\%$
test_ddpg_speed[False-backward] 4.6194ms 4.1899ms 238.6674 Ops/s 240.8251 Ops/s $\color{#d91a1a}-0.90\%$
test_ddpg_speed[True-None] 1.1789ms 1.0825ms 923.7902 Ops/s 930.3595 Ops/s $\color{#d91a1a}-0.71\%$
test_ddpg_speed[True-backward] 2.3822ms 2.2935ms 436.0112 Ops/s 439.7306 Ops/s $\color{#d91a1a}-0.85\%$
test_ddpg_speed[reduce-overhead-None] 1.1512ms 1.0865ms 920.3719 Ops/s 922.9999 Ops/s $\color{#d91a1a}-0.28\%$
test_ddpg_speed[reduce-overhead-backward] 1.8022ms 1.7626ms 567.3504 Ops/s 567.5121 Ops/s $\color{#d91a1a}-0.03\%$
test_sac_speed[False-None] 8.4038ms 7.9965ms 125.0553 Ops/s 126.4163 Ops/s $\color{#d91a1a}-1.08\%$
test_sac_speed[False-backward] 11.5040ms 11.0760ms 90.2850 Ops/s 91.9108 Ops/s $\color{#d91a1a}-1.77\%$
test_sac_speed[True-None] 1.6191ms 1.5411ms 648.8731 Ops/s 649.8829 Ops/s $\color{#d91a1a}-0.16\%$
test_sac_speed[True-backward] 3.5303ms 3.2726ms 305.5636 Ops/s 295.4835 Ops/s $\color{#35bf28}+3.41\%$
test_sac_speed[reduce-overhead-None] 23.3652ms 12.5735ms 79.5325 Ops/s 80.0655 Ops/s $\color{#d91a1a}-0.67\%$
test_sac_speed[reduce-overhead-backward] 1.3809ms 1.3224ms 756.2226 Ops/s 673.0928 Ops/s $\textbf{\color{#35bf28}+12.35\%}$
test_redq_speed[False-None] 8.1751ms 7.4589ms 134.0676 Ops/s 134.3618 Ops/s $\color{#d91a1a}-0.22\%$
test_redq_speed[False-backward] 12.0292ms 11.1680ms 89.5419 Ops/s 87.3768 Ops/s $\color{#35bf28}+2.48\%$
test_redq_speed[True-None] 2.0814ms 2.0010ms 499.7588 Ops/s 502.1517 Ops/s $\color{#d91a1a}-0.48\%$
test_redq_speed[True-backward] 4.0596ms 3.6680ms 272.6315 Ops/s 274.9332 Ops/s $\color{#d91a1a}-0.84\%$
test_redq_speed[reduce-overhead-None] 2.0850ms 2.0056ms 498.6070 Ops/s 502.8539 Ops/s $\color{#d91a1a}-0.84\%$
test_redq_speed[reduce-overhead-backward] 3.8202ms 3.7107ms 269.4924 Ops/s 260.0282 Ops/s $\color{#35bf28}+3.64\%$
test_redq_deprec_speed[False-None] 9.8250ms 9.0141ms 110.9372 Ops/s 109.4389 Ops/s $\color{#35bf28}+1.37\%$
test_redq_deprec_speed[False-backward] 12.3348ms 11.8840ms 84.1467 Ops/s 81.8604 Ops/s $\color{#35bf28}+2.79\%$
test_redq_deprec_speed[True-None] 2.5151ms 2.3433ms 426.7543 Ops/s 431.5276 Ops/s $\color{#d91a1a}-1.11\%$
test_redq_deprec_speed[True-backward] 4.0683ms 3.9996ms 250.0254 Ops/s 241.3100 Ops/s $\color{#35bf28}+3.61\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4346ms 2.3260ms 429.9276 Ops/s 431.7166 Ops/s $\color{#d91a1a}-0.41\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.4550ms 4.0293ms 248.1842 Ops/s 238.4851 Ops/s $\color{#35bf28}+4.07\%$
test_td3_speed[False-None] 7.9743ms 7.9053ms 126.4977 Ops/s 128.4631 Ops/s $\color{#d91a1a}-1.53\%$
test_td3_speed[False-backward] 10.6505ms 10.1943ms 98.0937 Ops/s 97.7170 Ops/s $\color{#35bf28}+0.39\%$
test_td3_speed[True-None] 1.6155ms 1.5955ms 626.7684 Ops/s 627.8296 Ops/s $\color{#d91a1a}-0.17\%$
test_td3_speed[True-backward] 3.2373ms 3.1276ms 319.7379 Ops/s 305.9091 Ops/s $\color{#35bf28}+4.52\%$
test_td3_speed[reduce-overhead-None] 51.3082ms 26.1981ms 38.1706 Ops/s 36.6764 Ops/s $\color{#35bf28}+4.07\%$
test_td3_speed[reduce-overhead-backward] 1.3282ms 1.2691ms 787.9820 Ops/s 699.0984 Ops/s $\textbf{\color{#35bf28}+12.71\%}$
test_cql_speed[False-None] 17.4195ms 16.8183ms 59.4590 Ops/s 60.1890 Ops/s $\color{#d91a1a}-1.21\%$
test_cql_speed[False-backward] 22.2734ms 21.7590ms 45.9580 Ops/s 45.4892 Ops/s $\color{#35bf28}+1.03\%$
test_cql_speed[True-None] 3.0643ms 2.9704ms 336.6546 Ops/s 338.8759 Ops/s $\color{#d91a1a}-0.66\%$
test_cql_speed[True-backward] 5.5054ms 5.0992ms 196.1103 Ops/s 189.2876 Ops/s $\color{#35bf28}+3.60\%$
test_cql_speed[reduce-overhead-None] 21.5089ms 13.2118ms 75.6899 Ops/s 76.4738 Ops/s $\color{#d91a1a}-1.02\%$
test_cql_speed[reduce-overhead-backward] 1.7360ms 1.6740ms 597.3813 Ops/s 596.3497 Ops/s $\color{#35bf28}+0.17\%$
test_a2c_speed[False-None] 3.3123ms 3.1787ms 314.5947 Ops/s 315.9185 Ops/s $\color{#d91a1a}-0.42\%$
test_a2c_speed[False-backward] 6.8343ms 6.2478ms 160.0567 Ops/s 161.9516 Ops/s $\color{#d91a1a}-1.17\%$
test_a2c_speed[True-None] 1.1747ms 1.0086ms 991.4729 Ops/s 992.5403 Ops/s $\color{#d91a1a}-0.11\%$
test_a2c_speed[True-backward] 2.8759ms 2.8153ms 355.1972 Ops/s 362.5508 Ops/s $\color{#d91a1a}-2.03\%$
test_a2c_speed[reduce-overhead-None] 21.2306ms 11.4094ms 87.6474 Ops/s 86.5322 Ops/s $\color{#35bf28}+1.29\%$
test_a2c_speed[reduce-overhead-backward] 1.0603ms 0.9661ms 1.0351 KOps/s 890.1636 Ops/s $\textbf{\color{#35bf28}+16.28\%}$
test_ppo_speed[False-None] 4.0977ms 3.7045ms 269.9389 Ops/s 277.2435 Ops/s $\color{#d91a1a}-2.63\%$
test_ppo_speed[False-backward] 7.3229ms 6.8496ms 145.9930 Ops/s 146.6954 Ops/s $\color{#d91a1a}-0.48\%$
test_ppo_speed[True-None] 1.3448ms 0.9641ms 1.0373 KOps/s 1.0338 KOps/s $\color{#35bf28}+0.34\%$
test_ppo_speed[True-backward] 2.8711ms 2.7414ms 364.7718 Ops/s 368.5062 Ops/s $\color{#d91a1a}-1.01\%$
test_ppo_speed[reduce-overhead-None] 0.5781ms 0.5049ms 1.9805 KOps/s 1.9160 KOps/s $\color{#35bf28}+3.37\%$
test_ppo_speed[reduce-overhead-backward] 1.1287ms 1.0938ms 914.2502 Ops/s 885.7258 Ops/s $\color{#35bf28}+3.22\%$
test_reinforce_speed[False-None] 2.3671ms 2.2656ms 441.3818 Ops/s 446.3360 Ops/s $\color{#d91a1a}-1.11\%$
test_reinforce_speed[False-backward] 3.7630ms 3.3396ms 299.4413 Ops/s 301.9655 Ops/s $\color{#d91a1a}-0.84\%$
test_reinforce_speed[True-None] 0.9367ms 0.8294ms 1.2057 KOps/s 1.1899 KOps/s $\color{#35bf28}+1.33\%$
test_reinforce_speed[True-backward] 2.6184ms 2.5618ms 390.3569 Ops/s 389.6018 Ops/s $\color{#35bf28}+0.19\%$
test_reinforce_speed[reduce-overhead-None] 22.1434ms 11.7971ms 84.7668 Ops/s 87.8692 Ops/s $\color{#d91a1a}-3.53\%$
test_reinforce_speed[reduce-overhead-backward] 1.2279ms 1.1598ms 862.2017 Ops/s 872.0270 Ops/s $\color{#d91a1a}-1.13\%$
test_iql_speed[False-None] 9.7642ms 9.2415ms 108.2077 Ops/s 110.2751 Ops/s $\color{#d91a1a}-1.87\%$
test_iql_speed[False-backward] 13.6625ms 13.1071ms 76.2943 Ops/s 76.6955 Ops/s $\color{#d91a1a}-0.52\%$
test_iql_speed[True-None] 1.8868ms 1.7701ms 564.9309 Ops/s 568.9610 Ops/s $\color{#d91a1a}-0.71\%$
test_iql_speed[True-backward] 4.5240ms 4.4226ms 226.1131 Ops/s 235.0752 Ops/s $\color{#d91a1a}-3.81\%$
test_iql_speed[reduce-overhead-None] 25.4911ms 11.5134ms 86.8552 Ops/s 87.5833 Ops/s $\color{#d91a1a}-0.83\%$
test_iql_speed[reduce-overhead-backward] 1.6410ms 1.5736ms 635.4673 Ops/s 630.2371 Ops/s $\color{#35bf28}+0.83\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0163ms 6.4611ms 154.7718 Ops/s 153.7156 Ops/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5374ms 0.3280ms 3.0489 KOps/s 2.8883 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5143ms 0.2982ms 3.3530 KOps/s 3.0581 KOps/s $\textbf{\color{#35bf28}+9.64\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5856ms 6.2067ms 161.1174 Ops/s 159.3391 Ops/s $\color{#35bf28}+1.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6587ms 0.3020ms 3.3110 KOps/s 3.0858 KOps/s $\textbf{\color{#35bf28}+7.30\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6097ms 0.2931ms 3.4120 KOps/s 3.5060 KOps/s $\color{#d91a1a}-2.68\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6352ms 1.4165ms 705.9698 Ops/s 808.9672 Ops/s $\textbf{\color{#d91a1a}-12.73\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6106ms 1.3711ms 729.3222 Ops/s 840.0891 Ops/s $\textbf{\color{#d91a1a}-13.19\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7569ms 6.3970ms 156.3238 Ops/s 155.5541 Ops/s $\color{#35bf28}+0.49\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3360ms 0.5022ms 1.9913 KOps/s 2.3212 KOps/s $\textbf{\color{#d91a1a}-14.21\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8377ms 0.4763ms 2.0996 KOps/s 2.4349 KOps/s $\textbf{\color{#d91a1a}-13.77\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4861ms 6.2079ms 161.0862 Ops/s 160.2563 Ops/s $\color{#35bf28}+0.52\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9140ms 0.3906ms 2.5600 KOps/s 2.7559 KOps/s $\textbf{\color{#d91a1a}-7.11\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.3230ms 0.3051ms 3.2774 KOps/s 3.0599 KOps/s $\textbf{\color{#35bf28}+7.11\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3803ms 6.1612ms 162.3072 Ops/s 160.4016 Ops/s $\color{#35bf28}+1.19\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8460ms 0.2970ms 3.3667 KOps/s 2.5330 KOps/s $\textbf{\color{#35bf28}+32.91\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5817ms 0.2998ms 3.3357 KOps/s 2.8417 KOps/s $\textbf{\color{#35bf28}+17.39\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5038ms 6.3737ms 156.8945 Ops/s 156.3173 Ops/s $\color{#35bf28}+0.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1680ms 0.5339ms 1.8728 KOps/s 2.0222 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7948ms 0.5349ms 1.8695 KOps/s 2.5694 KOps/s $\textbf{\color{#d91a1a}-27.24\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1318ms 5.2851ms 189.2099 Ops/s 191.7704 Ops/s $\color{#d91a1a}-1.34\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.2780ms 2.0986ms 476.5105 Ops/s 443.4699 Ops/s $\textbf{\color{#35bf28}+7.45\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.0574ms 1.1151ms 896.7508 Ops/s 857.0205 Ops/s $\color{#35bf28}+4.64\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.5569ms 5.3554ms 186.7285 Ops/s 192.9394 Ops/s $\color{#d91a1a}-3.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4990s 11.9084ms 83.9746 Ops/s 434.4573 Ops/s $\textbf{\color{#d91a1a}-80.67\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 16.1438ms 1.5067ms 663.7014 Ops/s 909.6354 Ops/s $\textbf{\color{#d91a1a}-27.04\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.0380ms 5.5733ms 179.4261 Ops/s 32.4193 Ops/s $\textbf{\color{#35bf28}+453.45\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.4992ms 2.1973ms 455.1127 Ops/s 456.2192 Ops/s $\color{#d91a1a}-0.24\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.0003ms 1.4388ms 695.0032 Ops/s 704.2492 Ops/s $\color{#d91a1a}-1.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.1141ms 13.4313ms 74.4532 Ops/s 74.1737 Ops/s $\color{#35bf28}+0.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.8363ms 17.4006ms 57.4691 Ops/s 58.8227 Ops/s $\color{#d91a1a}-2.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3334ms 17.8580ms 55.9972 Ops/s 54.8097 Ops/s $\color{#35bf28}+2.17\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6256ms 17.5537ms 56.9680 Ops/s 57.5664 Ops/s $\color{#d91a1a}-1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.6608ms 17.5144ms 57.0959 Ops/s 54.7513 Ops/s $\color{#35bf28}+4.28\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.0312ms 19.0844ms 52.3988 Ops/s 53.2070 Ops/s $\color{#d91a1a}-1.52\%$
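
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, with KOps/s read as 1000 Ops/s. This is inferred from the rows above, not taken from the benchmark tooling itself. A minimal sketch for re-checking a row, where the helper names `to_ops_per_second` and `percent_change` are hypothetical:

```python
def to_ops_per_second(cell: str) -> float:
    """Parse a throughput cell such as '1.0351 KOps/s' into plain Ops/s."""
    value, unit = cell.split()
    # Only the two units that appear in the table above are handled (assumption).
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}
    return float(value) * scale[unit]


def percent_change(ops: str, ops_on_head: str) -> float:
    """Relative change of this run's throughput against repo HEAD, in percent."""
    new = to_ops_per_second(ops)
    base = to_ops_per_second(ops_on_head)
    return (new - base) / base * 100.0


# Row test_td3_speed[reduce-overhead-backward]; the table reports +12.71%.
print(f"{percent_change('787.9820 Ops/s', '699.0984 Ops/s'):+.2f}%")
```

Normalising units first matters for rows such as test_a2c_speed[reduce-overhead-backward], where the two throughput cells use different units.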
