Performance issue in utils.py (by P3) #10

DLPerf · 2021-08-27T12:49:43Z

Hello! I've found a performance issue in utils.py: .batch(MODEL_PARAMS['batch_size']) (line 72) should be called before .map(parse_example_helper_csv, num_parallel_calls=8) (line 46). That way the parse function runs once per batch instead of once per element, which could make your program more efficient.

Here is the TensorFlow documentation to support it.

Besides, you need to check whether the function parse_example_helper_csv, called in .map(parse_example_helper_csv, num_parallel_calls=8), is affected by this change, so that the modified code still works properly. For example, if parse_example_helper_csv takes input of shape (x, y, z) before the fix, it will receive input of shape (batch_size, x, y, z) after the fix.
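To illustrate the reordering, here is a minimal sketch rather than the repo's actual code. It assumes TensorFlow 2.x, and LINES, BATCH_SIZE, DEFAULTS, and parse_csv are hypothetical stand-ins for the CSV data, MODEL_PARAMS['batch_size'], and parse_example_helper_csv. It relies on tf.io.decode_csv accepting either a single line or a batch of lines, so the same parse function works in both positions.

```python
import tensorflow as tf

# Hypothetical stand-ins for the repo's CSV rows and MODEL_PARAMS['batch_size'].
LINES = ["1.0,2.0,0", "3.0,4.0,1", "5.0,6.0,0", "7.0,8.0,1"]
BATCH_SIZE = 2
DEFAULTS = [[0.0], [0.0], [0]]  # two float features, one integer label


def parse_csv(records):
    """Parse one line or a batch of lines; decode_csv keeps the input shape."""
    f1, f2, label = tf.io.decode_csv(records, record_defaults=DEFAULTS)
    # Stacking on the last axis gives shape (2,) for one line
    # and (batch_size, 2) for a batch of lines.
    features = tf.stack([f1, f2], axis=-1)
    return features, label


# Before: parse one line per call, then batch the parsed elements.
scalar_ds = (
    tf.data.Dataset.from_tensor_slices(LINES)
    .map(parse_csv, num_parallel_calls=8)
    .batch(BATCH_SIZE)
)

# After: batch the raw lines first, then parse a whole batch per call.
vectorized_ds = (
    tf.data.Dataset.from_tensor_slices(LINES)
    .batch(BATCH_SIZE)
    .map(parse_csv, num_parallel_calls=8)
)

scalar_batches = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in scalar_ds]
vectorized_batches = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in vectorized_ds]
```

Both pipelines should yield the same batches. If the real parse function indexes into a fixed-rank tensor, it would need adjusting for the extra leading batch dimension.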

Looking forward to your reply. Btw, I would be glad to create a PR to fix it if you are too busy.


DSXiangLi · 2021-08-27T23:44:56Z

@DLPerf Thanks for pointing this out! Honestly I haven't paid much attention to performance before >< I just took a look at that performance doc, and found there are actually multiple ways to speed up tf.data ^O^ !

Speed up data transformation:

sequential mapping -> parallel mapping, by using num_parallel_calls, which I already use in the code
scalar mapping -> vectorized mapping, by calling batch before map, as you proposed

Speed up data extraction: sequential extraction -> parallel extraction, by using interleave. But in order to use this, I think we would need to split the training samples into multiple TFRecord files in advance?
Overlap the above ops with training, by using prefetch.
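The last two ideas could look roughly like the sketch below. It is an illustration under assumptions, not the repo's pipeline: it assumes TensorFlow 2.4 or later (for tf.data.AUTOTUNE), and the shard names and contents are hypothetical. It uses plain CSV text shards to stay self-contained; with TFRecord shards, tf.data.TFRecordDataset would take the place of tf.data.TextLineDataset.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical shards: the training set is split into several files in advance.
tmp_dir = tempfile.mkdtemp()
shard_paths = []
for shard_id in range(2):
    path = os.path.join(tmp_dir, f"train-{shard_id}.csv")
    with open(path, "w") as f:
        for row in range(3):
            f.write(f"{shard_id},{row}\n")
    shard_paths.append(path)

dataset = (
    tf.data.Dataset.from_tensor_slices(shard_paths)
    # Parallel extraction: read from several shards concurrently.
    .interleave(
        lambda p: tf.data.TextLineDataset(p),
        cycle_length=len(shard_paths),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .batch(2)
    # Overlap input preparation with the consumer (e.g. the training step).
    .prefetch(tf.data.AUTOTUNE)
)

# Collect every line to confirm that nothing is lost across shards.
lines = [line.decode() for batch in dataset for line in batch.numpy()]
```

With a single input file, interleave has nothing to parallelize, so sharding the data beforehand is what makes this step useful; prefetch can be added regardless.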

Maybe we can add the other 2 also? Currently I am not available to maintain this repo, so could you please help me fix this? Much appreciated!
