[BugFix] Fix MPS sync in device transform #2061

Merged 1 commit on Apr 7, 2024

vmoens commented Apr 7, 2024

No description provided.

pytorch-bot bot commented Apr 7, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2061

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 21 Unrelated Failures

As of commit b18448d with merge base f85da4c:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Apr 7, 2024
vmoens added the bug label Apr 7, 2024
vmoens merged commit 4488c25 into main Apr 7, 2024
27 of 44 checks passed
vmoens deleted the fix-mps-sync branch April 7, 2024 15:08
github-actions bot commented Apr 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 54.4488ms 53.7567ms 18.6023 Ops/s 18.5956 Ops/s $\color{#35bf28}+0.04\%$
test_sync 40.7471ms 31.2544ms 31.9955 Ops/s 33.6957 Ops/s $\textbf{\color{#d91a1a}-5.05\%}$
test_async 61.5601ms 26.5140ms 37.7159 Ops/s 37.4011 Ops/s $\color{#35bf28}+0.84\%$
test_simple 0.3309s 0.3295s 3.0345 Ops/s 3.0774 Ops/s $\color{#d91a1a}-1.39\%$
test_transformed 0.4833s 0.4799s 2.0836 Ops/s 2.0724 Ops/s $\color{#35bf28}+0.54\%$
test_serial 1.2626s 1.2031s 0.8312 Ops/s 0.8337 Ops/s $\color{#d91a1a}-0.30\%$
test_parallel 1.0862s 1.0008s 0.9992 Ops/s 1.0086 Ops/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[True-True-True-True-True] 0.1449ms 21.4087μs 46.7099 KOps/s 46.9394 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[True-True-True-True-False] 59.3540μs 13.0669μs 76.5290 KOps/s 77.0917 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-True-True-False-True] 36.9390μs 12.5254μs 79.8378 KOps/s 79.5348 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-True-True-False-False] 29.3650μs 7.7904μs 128.3626 KOps/s 131.0800 KOps/s $\color{#d91a1a}-2.07\%$
test_step_mdp_speed[True-True-False-True-True] 50.8050μs 23.0209μs 43.4388 KOps/s 43.9971 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-True-False-True-False] 34.5440μs 14.3557μs 69.6586 KOps/s 70.0330 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-False-False-True] 82.5950μs 13.8037μs 72.4446 KOps/s 73.3274 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-True-False-False-False] 38.8630μs 8.9338μs 111.9350 KOps/s 113.5555 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-False-True-True-True] 58.4990μs 24.2402μs 41.2538 KOps/s 41.6103 KOps/s $\color{#d91a1a}-0.86\%$
test_step_mdp_speed[True-False-True-True-False] 57.3870μs 15.7366μs 63.5463 KOps/s 64.2368 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-False-True-False-True] 34.3840μs 13.8637μs 72.1308 KOps/s 73.2961 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[True-False-True-False-False] 31.7590μs 8.9005μs 112.3534 KOps/s 113.4345 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[True-False-False-True-True] 53.1090μs 25.4716μs 39.2594 KOps/s 39.8189 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[True-False-False-True-False] 51.6360μs 16.9666μs 58.9392 KOps/s 59.7928 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-False-False-False-True] 37.8610μs 15.1381μs 66.0586 KOps/s 67.2573 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[True-False-False-False-False] 35.5460μs 10.0907μs 99.1015 KOps/s 99.2289 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[False-True-True-True-True] 69.9910μs 24.4050μs 40.9752 KOps/s 41.7336 KOps/s $\color{#d91a1a}-1.82\%$
test_step_mdp_speed[False-True-True-True-False] 40.7970μs 15.9438μs 62.7203 KOps/s 64.3130 KOps/s $\color{#d91a1a}-2.48\%$
test_step_mdp_speed[False-True-True-False-True] 50.3140μs 16.1155μs 62.0521 KOps/s 63.2472 KOps/s $\color{#d91a1a}-1.89\%$
test_step_mdp_speed[False-True-True-False-False] 30.2870μs 10.2207μs 97.8409 KOps/s 99.2308 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[False-True-False-True-True] 39.0330μs 25.3420μs 39.4602 KOps/s 39.6122 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-True-False-True-False] 63.8800μs 17.0109μs 58.7859 KOps/s 59.8818 KOps/s $\color{#d91a1a}-1.83\%$
test_step_mdp_speed[False-True-False-False-True] 44.9850μs 17.2400μs 58.0048 KOps/s 58.3863 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-True-False-False-False] 43.2920μs 11.3026μs 88.4753 KOps/s 88.9589 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[False-False-True-True-True] 62.9570μs 26.8231μs 37.2813 KOps/s 37.8518 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-False-True-True-False] 48.5810μs 18.4546μs 54.1870 KOps/s 55.5427 KOps/s $\color{#d91a1a}-2.44\%$
test_step_mdp_speed[False-False-True-False-True] 49.6730μs 17.2578μs 57.9447 KOps/s 58.6263 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-False-True-False-False] 32.6310μs 11.3208μs 88.3327 KOps/s 89.3970 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[False-False-False-True-True] 63.4990μs 27.4091μs 36.4842 KOps/s 36.4095 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[False-False-False-True-False] 64.2000μs 19.3033μs 51.8046 KOps/s 52.0551 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-False-False-False-True] 38.2020μs 18.1465μs 55.1070 KOps/s 55.0688 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[False-False-False-False-False] 36.7890μs 12.3893μs 80.7148 KOps/s 80.8625 KOps/s $\color{#d91a1a}-0.18\%$
test_values[generalized_advantage_estimate-True-True] 10.7725ms 9.3239ms 107.2511 Ops/s 110.3589 Ops/s $\color{#d91a1a}-2.82\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.2962ms 35.0547ms 28.5268 Ops/s 30.0333 Ops/s $\textbf{\color{#d91a1a}-5.02\%}$
test_values[td0_return_estimate-False-False] 0.2309ms 0.1706ms 5.8600 KOps/s 5.9944 KOps/s $\color{#d91a1a}-2.24\%$
test_values[td1_return_estimate-False-False] 25.6635ms 23.3179ms 42.8856 Ops/s 43.3350 Ops/s $\color{#d91a1a}-1.04\%$
test_values[vec_td1_return_estimate-False-False] 36.7060ms 35.3473ms 28.2907 Ops/s 29.9045 Ops/s $\textbf{\color{#d91a1a}-5.40\%}$
test_values[td_lambda_return_estimate-True-False] 36.5404ms 33.5692ms 29.7892 Ops/s 29.9209 Ops/s $\color{#d91a1a}-0.44\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.4818ms 35.1998ms 28.4093 Ops/s 29.8113 Ops/s $\color{#d91a1a}-4.70\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.1584ms 8.1635ms 122.4967 Ops/s 123.9113 Ops/s $\color{#d91a1a}-1.14\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2658ms 2.0157ms 496.0947 Ops/s 505.5063 Ops/s $\color{#d91a1a}-1.86\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4108ms 0.3495ms 2.8616 KOps/s 2.8576 KOps/s $\color{#35bf28}+0.14\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.7642ms 46.4531ms 21.5271 Ops/s 25.0815 Ops/s $\textbf{\color{#d91a1a}-14.17\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.5643ms 3.0383ms 329.1328 Ops/s 331.6181 Ops/s $\color{#d91a1a}-0.75\%$
test_dqn_speed 7.2291ms 1.3527ms 739.2487 Ops/s 750.1960 Ops/s $\color{#d91a1a}-1.46\%$
test_ddpg_speed 2.9492ms 2.6726ms 374.1690 Ops/s 376.7566 Ops/s $\color{#d91a1a}-0.69\%$
test_sac_speed 9.9393ms 8.1799ms 122.2512 Ops/s 122.5202 Ops/s $\color{#d91a1a}-0.22\%$
test_redq_speed 13.7463ms 13.0626ms 76.5543 Ops/s 76.9267 Ops/s $\color{#d91a1a}-0.48\%$
test_redq_deprec_speed 13.9632ms 13.0018ms 76.9122 Ops/s 77.6151 Ops/s $\color{#d91a1a}-0.91\%$
test_td3_speed 14.7985ms 8.1470ms 122.7451 Ops/s 122.9131 Ops/s $\color{#d91a1a}-0.14\%$
test_cql_speed 37.9096ms 36.1076ms 27.6950 Ops/s 27.5870 Ops/s $\color{#35bf28}+0.39\%$
test_a2c_speed 8.8849ms 7.5134ms 133.0960 Ops/s 137.5723 Ops/s $\color{#d91a1a}-3.25\%$
test_ppo_speed 8.7557ms 7.5678ms 132.1382 Ops/s 131.7655 Ops/s $\color{#35bf28}+0.28\%$
test_reinforce_speed 7.4450ms 6.5254ms 153.2484 Ops/s 154.1358 Ops/s $\color{#d91a1a}-0.58\%$
test_iql_speed 33.7137ms 32.3208ms 30.9399 Ops/s 31.1437 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.4987ms 2.2037ms 453.7830 Ops/s 454.9846 Ops/s $\color{#d91a1a}-0.26\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9745ms 0.4998ms 2.0007 KOps/s 1.8105 KOps/s $\textbf{\color{#35bf28}+10.51\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7297ms 0.4806ms 2.0808 KOps/s 2.1195 KOps/s $\color{#d91a1a}-1.82\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.4302ms 2.1483ms 465.4864 Ops/s 442.3655 Ops/s $\textbf{\color{#35bf28}+5.23\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9903ms 0.4922ms 2.0318 KOps/s 2.0313 KOps/s $\color{#35bf28}+0.02\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6656ms 0.4660ms 2.1460 KOps/s 2.1177 KOps/s $\color{#35bf28}+1.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4127ms 1.2158ms 822.4796 Ops/s 823.5397 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4977ms 1.1524ms 867.7867 Ops/s 867.1852 Ops/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.9177ms 2.3719ms 421.6047 Ops/s 434.8040 Ops/s $\color{#d91a1a}-3.04\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0784ms 0.6156ms 1.6244 KOps/s 1.6167 KOps/s $\color{#35bf28}+0.47\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7692ms 0.5950ms 1.6807 KOps/s 1.6977 KOps/s $\color{#d91a1a}-1.00\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.3005ms 2.1815ms 458.3961 Ops/s 455.5777 Ops/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0021ms 0.4966ms 2.0138 KOps/s 2.0136 KOps/s $+0.01\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7245ms 0.4738ms 2.1105 KOps/s 2.0963 KOps/s $\color{#35bf28}+0.68\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.5379ms 2.2256ms 449.3246 Ops/s 434.0100 Ops/s $\color{#35bf28}+3.53\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6566ms 0.4888ms 2.0458 KOps/s 2.0279 KOps/s $\color{#35bf28}+0.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.6676ms 0.4693ms 2.1309 KOps/s 2.1150 KOps/s $\color{#35bf28}+0.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.4636ms 2.2618ms 442.1225 Ops/s 412.6049 Ops/s $\textbf{\color{#35bf28}+7.15\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2243ms 0.6228ms 1.6055 KOps/s 1.6205 KOps/s $\color{#d91a1a}-0.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9306ms 0.5875ms 1.7020 KOps/s 1.6650 KOps/s $\color{#35bf28}+2.22\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 96.9173ms 7.0448ms 141.9487 Ops/s 138.8005 Ops/s $\color{#35bf28}+2.27\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 14.3972ms 12.0339ms 83.0983 Ops/s 82.6892 Ops/s $\color{#35bf28}+0.49\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.8415ms 1.1039ms 905.9162 Ops/s 951.6990 Ops/s $\color{#d91a1a}-4.81\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 83.6081ms 6.8073ms 146.9008 Ops/s 141.0413 Ops/s $\color{#35bf28}+4.15\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 14.3676ms 11.9699ms 83.5431 Ops/s 83.9303 Ops/s $\color{#d91a1a}-0.46\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.1892ms 1.1599ms 862.1703 Ops/s 963.6168 Ops/s $\textbf{\color{#d91a1a}-10.53\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 88.2678ms 5.6846ms 175.9126 Ops/s 175.1812 Ops/s $\color{#35bf28}+0.42\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 14.8823ms 12.3514ms 80.9626 Ops/s 80.8565 Ops/s $\color{#35bf28}+0.13\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.6210ms 1.4650ms 682.5937 Ops/s 741.5229 Ops/s $\textbf{\color{#d91a1a}-7.95\%}$
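
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that reading. The helper names `pct_change` and `classify` are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bold and counted as Improved or Worsened; it is not taken from the benchmark workflow itself.

```python
def pct_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Bucket a change; threshold_pct is an assumed cutoff, not a documented one."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table; both values in a
# pair use the same unit, so the ratio is unit-free.
rows = {
    "test_sync": (31.9955, 33.6957),
    "test_gae_speed[vec_generalized_advantage_estimate-True-32-512]": (21.5271, 25.0815),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]": (905.9162, 951.6990),
}

for name, (ops_pr, ops_head) in rows.items():
    change = pct_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, a row such as `test_rb_populate[...LazyTensorStorage-RandomSampler-400]` at -4.81% falls below the assumed cutoff, which would explain why it is not counted among the six worsened CPU benchmarks.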

github-actions bot commented Apr 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 98.3261ms 97.5599ms 10.2501 Ops/s 9.6805 Ops/s $\textbf{\color{#35bf28}+5.88\%}$
test_sync 87.5215ms 86.8031ms 11.5203 Ops/s 11.3518 Ops/s $\color{#35bf28}+1.48\%$
test_async 0.1671s 83.5741ms 11.9654 Ops/s 14.1845 Ops/s $\textbf{\color{#d91a1a}-15.64\%}$
test_single_pixels 0.1817s 0.1161s 8.6118 Ops/s 9.1529 Ops/s $\textbf{\color{#d91a1a}-5.91\%}$
test_sync_pixels 0.1253s 69.4839ms 14.3918 Ops/s 14.3941 Ops/s $\color{#d91a1a}-0.02\%$
test_async_pixels 0.1205s 60.4100ms 16.5535 Ops/s 16.4201 Ops/s $\color{#35bf28}+0.81\%$
test_simple 0.7294s 0.6674s 1.4984 Ops/s 1.4886 Ops/s $\color{#35bf28}+0.66\%$
test_transformed 0.9383s 0.8821s 1.1336 Ops/s 1.1371 Ops/s $\color{#d91a1a}-0.30\%$
test_serial 2.1130s 2.0528s 0.4871 Ops/s 0.4902 Ops/s $\color{#d91a1a}-0.62\%$
test_parallel 1.8357s 1.7690s 0.5653 Ops/s 0.5608 Ops/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-True-True-True] 0.1507ms 32.2907μs 30.9687 KOps/s 30.5418 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-True-True-False] 66.8610μs 19.0165μs 52.5860 KOps/s 51.1607 KOps/s $\color{#35bf28}+2.79\%$
test_step_mdp_speed[True-True-True-False-True] 41.6110μs 18.4952μs 54.0681 KOps/s 54.2888 KOps/s $\color{#d91a1a}-0.41\%$
test_step_mdp_speed[True-True-True-False-False] 33.0410μs 10.9106μs 91.6541 KOps/s 88.9879 KOps/s $\color{#35bf28}+3.00\%$
test_step_mdp_speed[True-True-False-True-True] 60.9810μs 34.2533μs 29.1942 KOps/s 29.0741 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[True-True-False-True-False] 47.6100μs 21.0029μs 47.6125 KOps/s 46.5477 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[True-True-False-False-True] 39.4210μs 19.9458μs 50.1359 KOps/s 49.0156 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[True-True-False-False-False] 31.3310μs 12.7055μs 78.7058 KOps/s 76.0459 KOps/s $\color{#35bf28}+3.50\%$
test_step_mdp_speed[True-False-True-True-True] 64.7110μs 35.5979μs 28.0915 KOps/s 27.2241 KOps/s $\color{#35bf28}+3.19\%$
test_step_mdp_speed[True-False-True-True-False] 48.3110μs 22.7264μs 44.0018 KOps/s 42.4887 KOps/s $\color{#35bf28}+3.56\%$
test_step_mdp_speed[True-False-True-False-True] 94.4610μs 20.1086μs 49.7299 KOps/s 49.2887 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[True-False-True-False-False] 30.7110μs 12.7090μs 78.6841 KOps/s 75.9183 KOps/s $\color{#35bf28}+3.64\%$
test_step_mdp_speed[True-False-False-True-True] 68.5010μs 37.4832μs 26.6786 KOps/s 26.1936 KOps/s $\color{#35bf28}+1.85\%$
test_step_mdp_speed[True-False-False-True-False] 85.1910μs 24.2063μs 41.3116 KOps/s 39.7502 KOps/s $\color{#35bf28}+3.93\%$
test_step_mdp_speed[True-False-False-False-True] 46.1100μs 21.7605μs 45.9549 KOps/s 44.7511 KOps/s $\color{#35bf28}+2.69\%$
test_step_mdp_speed[True-False-False-False-False] 34.6100μs 14.4781μs 69.0701 KOps/s 66.7130 KOps/s $\color{#35bf28}+3.53\%$
test_step_mdp_speed[False-True-True-True-True] 58.7010μs 36.2140μs 27.6136 KOps/s 27.3391 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[False-True-True-True-False] 41.8100μs 22.6915μs 44.0693 KOps/s 42.5050 KOps/s $\color{#35bf28}+3.68\%$
test_step_mdp_speed[False-True-True-False-True] 46.4900μs 23.9667μs 41.7245 KOps/s 41.3963 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-True-True-False-False] 91.6220μs 14.6372μs 68.3191 KOps/s 66.5753 KOps/s $\color{#35bf28}+2.62\%$
test_step_mdp_speed[False-True-False-True-True] 72.0100μs 37.8626μs 26.4113 KOps/s 25.9902 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[False-True-False-True-False] 48.0900μs 24.6930μs 40.4972 KOps/s 39.4195 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-True-False-False-True] 46.3300μs 25.5208μs 39.1837 KOps/s 38.0819 KOps/s $\color{#35bf28}+2.89\%$
test_step_mdp_speed[False-True-False-False-False] 65.4310μs 16.5109μs 60.5660 KOps/s 59.3134 KOps/s $\color{#35bf28}+2.11\%$
test_step_mdp_speed[False-False-True-True-True] 69.4010μs 39.8953μs 25.0656 KOps/s 24.9444 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[False-False-True-True-False] 51.0910μs 26.3589μs 37.9379 KOps/s 36.3646 KOps/s $\color{#35bf28}+4.33\%$
test_step_mdp_speed[False-False-True-False-True] 63.0510μs 25.4064μs 39.3601 KOps/s 38.6789 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[False-False-True-False-False] 39.9400μs 16.3731μs 61.0759 KOps/s 59.4220 KOps/s $\color{#35bf28}+2.78\%$
test_step_mdp_speed[False-False-False-True-True] 59.6210μs 40.7835μs 24.5197 KOps/s 23.9891 KOps/s $\color{#35bf28}+2.21\%$
test_step_mdp_speed[False-False-False-True-False] 52.3410μs 28.2508μs 35.3972 KOps/s 34.3519 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[False-False-False-False-True] 48.8310μs 26.9787μs 37.0663 KOps/s 35.9718 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[False-False-False-False-False] 43.7720μs 18.3875μs 54.3848 KOps/s 53.8738 KOps/s $\color{#35bf28}+0.95\%$
test_values[generalized_advantage_estimate-True-True] 24.7304ms 23.5613ms 42.4425 Ops/s 44.0717 Ops/s $\color{#d91a1a}-3.70\%$
test_values[vec_generalized_advantage_estimate-True-True] 82.8893ms 3.1996ms 312.5350 Ops/s 312.4223 Ops/s $\color{#35bf28}+0.04\%$
test_values[td0_return_estimate-False-False] 94.5020μs 62.0270μs 16.1220 KOps/s 16.2501 KOps/s $\color{#d91a1a}-0.79\%$
test_values[td1_return_estimate-False-False] 52.1591ms 49.8369ms 20.0654 Ops/s 20.4682 Ops/s $\color{#d91a1a}-1.97\%$
test_values[vec_td1_return_estimate-False-False] 2.0505ms 1.7338ms 576.7542 Ops/s 579.5462 Ops/s $\color{#d91a1a}-0.48\%$
test_values[td_lambda_return_estimate-True-False] 80.9005ms 79.7395ms 12.5408 Ops/s 12.8288 Ops/s $\color{#d91a1a}-2.24\%$
test_values[vec_td_lambda_return_estimate-True-False] 2.0973ms 1.7262ms 579.3019 Ops/s 579.5764 Ops/s $\color{#d91a1a}-0.05\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.4444ms 22.1703ms 45.1054 Ops/s 45.7196 Ops/s $\color{#d91a1a}-1.34\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.8547ms 0.6756ms 1.4801 KOps/s 1.5005 KOps/s $\color{#d91a1a}-1.36\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6732ms 0.6198ms 1.6133 KOps/s 1.6208 KOps/s $\color{#d91a1a}-0.46\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6087ms 1.4312ms 698.7074 Ops/s 702.4859 Ops/s $\color{#d91a1a}-0.54\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9241ms 0.6426ms 1.5561 KOps/s 1.5863 KOps/s $\color{#d91a1a}-1.90\%$
test_dqn_speed 1.6416ms 1.3754ms 727.0695 Ops/s 682.9712 Ops/s $\textbf{\color{#35bf28}+6.46\%}$
test_ddpg_speed 3.2096ms 2.6565ms 376.4356 Ops/s 370.4446 Ops/s $\color{#35bf28}+1.62\%$
test_sac_speed 8.4061ms 7.8773ms 126.9478 Ops/s 124.7112 Ops/s $\color{#35bf28}+1.79\%$
test_redq_speed 11.2679ms 10.2480ms 97.5803 Ops/s 97.2531 Ops/s $\color{#35bf28}+0.34\%$
test_redq_deprec_speed 12.2850ms 11.3063ms 88.4464 Ops/s 89.4091 Ops/s $\color{#d91a1a}-1.08\%$
test_td3_speed 8.3262ms 7.8576ms 127.2661 Ops/s 126.1425 Ops/s $\color{#35bf28}+0.89\%$
test_cql_speed 0.1044s 26.5435ms 37.6740 Ops/s 40.1627 Ops/s $\textbf{\color{#d91a1a}-6.20\%}$
test_a2c_speed 5.9148ms 5.1410ms 194.5161 Ops/s 184.5642 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_ppo_speed 5.6925ms 5.4406ms 183.8020 Ops/s 176.0966 Ops/s $\color{#35bf28}+4.38\%$
test_reinforce_speed 5.0323ms 4.1565ms 240.5848 Ops/s 223.4896 Ops/s $\textbf{\color{#35bf28}+7.65\%}$
test_iql_speed 19.6979ms 18.7908ms 53.2176 Ops/s 52.2047 Ops/s $\color{#35bf28}+1.94\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.1028ms 2.8739ms 347.9618 Ops/s 347.6749 Ops/s $\color{#35bf28}+0.08\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3217ms 0.5411ms 1.8480 KOps/s 1.8542 KOps/s $\color{#d91a1a}-0.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6982ms 0.5198ms 1.9236 KOps/s 1.9215 KOps/s $\color{#35bf28}+0.11\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1172ms 2.8651ms 349.0257 Ops/s 345.4794 Ops/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3278ms 0.5318ms 1.8806 KOps/s 1.8745 KOps/s $\color{#35bf28}+0.32\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7623ms 0.5093ms 1.9635 KOps/s 1.9131 KOps/s $\color{#35bf28}+2.63\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6119ms 1.4178ms 705.3332 Ops/s 693.8239 Ops/s $\color{#35bf28}+1.66\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5238ms 1.3517ms 739.8034 Ops/s 730.2267 Ops/s $\color{#35bf28}+1.31\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1057ms 2.9624ms 337.5660 Ops/s 334.4367 Ops/s $\color{#35bf28}+0.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8809ms 0.6569ms 1.5224 KOps/s 1.3351 KOps/s $\textbf{\color{#35bf28}+14.03\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.1093s 0.7308ms 1.3684 KOps/s 1.5712 KOps/s $\textbf{\color{#d91a1a}-12.91\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.0807ms 2.8713ms 348.2734 Ops/s 346.4348 Ops/s $\color{#35bf28}+0.53\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7485ms 0.5428ms 1.8422 KOps/s 1.8543 KOps/s $\color{#d91a1a}-0.65\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.2522ms 0.5195ms 1.9250 KOps/s 1.9303 KOps/s $\color{#d91a1a}-0.27\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1084ms 2.8919ms 345.7875 Ops/s 345.5795 Ops/s $\color{#35bf28}+0.06\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1054s 0.6824ms 1.4654 KOps/s 1.8954 KOps/s $\textbf{\color{#d91a1a}-22.69\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7567ms 0.5145ms 1.9435 KOps/s 1.9530 KOps/s $\color{#d91a1a}-0.49\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1987ms 3.0054ms 332.7390 Ops/s 332.7765 Ops/s $\color{#d91a1a}-0.01\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8348ms 0.6627ms 1.5089 KOps/s 1.5170 KOps/s $\color{#d91a1a}-0.53\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 4.5664ms 0.6432ms 1.5546 KOps/s 1.5723 KOps/s $\color{#d91a1a}-1.12\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1068s 8.8123ms 113.4779 Ops/s 110.5397 Ops/s $\color{#35bf28}+2.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 16.3418ms 13.9949ms 71.4546 Ops/s 71.1407 Ops/s $\color{#35bf28}+0.44\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9181ms 1.0623ms 941.3711 Ops/s 870.7161 Ops/s $\textbf{\color{#35bf28}+8.11\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1026s 6.7367ms 148.4415 Ops/s 149.2564 Ops/s $\color{#d91a1a}-0.55\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 16.2122ms 13.9093ms 71.8945 Ops/s 71.0328 Ops/s $\color{#35bf28}+1.21\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0866ms 1.0985ms 910.3604 Ops/s 782.4122 Ops/s $\textbf{\color{#35bf28}+16.35\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1034s 9.0252ms 110.8008 Ops/s 110.9907 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 16.5519ms 14.2206ms 70.3203 Ops/s 69.7036 Ops/s $\color{#35bf28}+0.88\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.4270ms 1.4255ms 701.5096 Ops/s 684.3118 Ops/s $\color{#35bf28}+2.51\%$

Labels
bug: Something isn't working
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
2 participants