How do I debug performance of deflate in libewf (ewfacquirestream)? #178

Open
JustMyGithub opened this issue Jan 12, 2023 · 6 comments

@JustMyGithub

Hey there,

I just built the current version and I notice that compression speed is pretty slow with best (deflate) compression.
It starts at about 10 MB/s and after 10 GB it reaches about 35-40 MB/s. It does not seem to matter whether I use -j 1, -j 12 or -j 50; ewfacquirestream only uses about 8-10% of my 6x2 cores. The legacy build achieves the same performance.
I do expect lower throughput with best compression, of course. However, the CPU is hardly utilized, and neither are the source and destination drives. If the compression is parallelized (at least in the non-legacy version; why would there be a -j parameter otherwise), what is the bottleneck, and can I do something about it with different configuration parameters?

Thx

@joachimmetz
Member

I'm not able to tell; there is too little information about your configuration, build and test parameters. The code is considered experimental; you could use normal code profiling tools to see what is going on.
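
For example, a generic gprof-based run might look like the sketch below (this is only a sketch, not a libewf-specific procedure; it assumes gcc and gprof are available in the Cygwin environment, the test image path is a placeholder, and the actual binary may live under ewftools/.libs/ because of the libtool wrapper):

# rebuild with profiling instrumentation (generic approach, not libewf-specific)
CFLAGS="-O2 -pg" ./configure
make clean && make

# run a short acquisition so that gmon.out is written on exit
cat /path/to/small-test-image.raw | ./ewftools/ewfacquirestream -c deflate:best -t /tmp/profile-test

# summarize where the time was spent (adjust the binary path to your libtool layout)
gprof ewftools/.libs/ewfacquirestream.exe gmon.out > profile.txt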

@joachimmetz joachimmetz self-assigned this Jan 12, 2023
@joachimmetz joachimmetz changed the title Understanding performance of deflate in libewf (ewfacquirestream) How do I debug performance of deflate in libewf (ewfacquirestream)? Jan 12, 2023
@JustMyGithub
Author

I understand that the non-legacy version is experimental. I tried this version because, based on my understanding of the documentation, the legacy version may not be parallelized; but as outlined above the performance is about the same.

I do not have experience with profiling code for performance, but I am happy to provide the information you need. I assume the most helpful information is the ./configure summary. Might it be useful to use libcthreads instead of pthreads?

Building:
libcerror support: local
libcthreads support: local
libcdata support: local
libcdatetime support: local
libclocale support: local
libcnotify support: local
libcsplit support: local
libuna support: local
libcfile support: local
libcpath support: local
libbfio support: local
libfcache support: local
libfdata support: local
libfdatetime support: local
libfguid support: local
libfvalue support: local
ADLER32 checksum support: zlib
DEFLATE compression support: zlib
BZIP2 compression support: bzlib
libhmac support: local
MD5 support: libcrypto_evp
SHA1 support: libcrypto_evp
SHA256 support: libcrypto_evp
libcaes support: local
libodraw support: local
libsmdev support: local
libsmraw support: local
GUID/UUID support: check
FUSE support: no

Features:
Multi-threading support: pthread
Wide character type support: no
Write support: yes
ewftools are build as static executables: no
Python (pyewf) support: no
Verbose output: no
Debug output: no
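
(In case it helps, one way to confirm which zlib the built tool actually links against could be to list its DLL dependencies; the .libs path below is a guess based on the usual libtool layout:)

# list the DLLs the built tool depends on, e.g. to confirm that cygz (zlib) is picked up
cygcheck ./ewftools/.libs/ewfacquirestream.exe
# or alternatively
ldd ./ewftools/.libs/ewfacquirestream.exe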

@joachimmetz
Member

I tried this version because, based on my understanding of the documentation, the legacy version may not be parallelized; but as outlined above the performance is about the same.

This should not be the case, but you might be hitting an as-yet unimplemented use case. Can you describe, step by step:

  1. how you built the code, including attaching config.log (see the sketch after this list)
  2. what commands you ran, and the start of their output
  3. other details that might be relevant.
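
For the build part, something like the following could be used to capture the relevant output (a sketch; config.log is written by ./configure itself into the source directory):

./synclibs.sh && ./autogen.sh
./configure 2>&1 | tee configure-output.txt
make 2>&1 | tee make-output.txt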

Might it be useful to use libcthreads instead of pthreads?

libcthreads is a cross-platform wrapper around different thread implementations; on some configurations it uses the pthread (without the "s") implementation under the hood.

@JustMyGithub
Author

JustMyGithub commented Jan 12, 2023

legacy_config.log
non-legacy_config.log

  1. In Cygwin, following https://github.com/libyal/libewf/wiki/Building, so basically:
./synclibs.sh
./autogen.sh
./configure
make

pbzip2.exe -d --stdout /cygdrive/d/testimage.bz2 | ~/libewf-legacy-main/ewftools/ewfacquirestream -c deflate:best -S 25G -l /cygdrive/k/libewf-legacy.log -d sha256 -t /cygdrive/k/image-legacy

(pbzip2 is not the bottleneck: if I use fast compression, for example, the throughput is much higher. For non-legacy I used the same parameters but added -j for testing, as written in the previous post. /cygdrive/d and /cygdrive/k are different drives and hardly utilized, so they are not the bottleneck either.)
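
As a rough cross-check of the input side (assuming pv is installed in the Cygwin environment), the raw decompression throughput can be measured with ewfacquirestream taken out of the pipeline:

# pv reports the throughput of the data flowing through the pipe
pbzip2.exe -d --stdout /cygdrive/d/testimage.bz2 | pv > /dev/null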

ewfacquirestream 20140815

Using the following acquiry parameters:
Image path and filename:                /cygdrive/k/image-legacy.E01
Case number:                            case_number
Description:                            description
Evidence number:                        evidence_number
Examiner name:                          examiner_name
Notes:                                  notes
Media type:                             fixed disk
Is physical:                            yes
EWF file format:                        EnCase 6 (.E01)
Compression method:                     deflate
Compression level:                      best
Acquiry start offset:                   0
Number of bytes to acquire:             0 (until end of input)
Evidence segment file size:             25 GiB (26843545600 bytes)
Bytes per sector:                       512
Block size:                             64 sectors
Error granularity:                      64 sectors
Retries on read error:                  2
Zero sectors on read error:             no

Acquiry started at: Jan 11, 2023 10:47:33
This could take a while.

Status: acquired 74 MiB (78610432 bytes)
        in 4 second(s) with 18 MiB/s (19652608 bytes/second).

Status: acquired 104 MiB (109379584 bytes)
        in 8 second(s) with 13 MiB/s (13672448 bytes/second).

Status: acquired 134 MiB (140738560 bytes)
        in 12 second(s) with 11 MiB/s (11728213 bytes/second).
[...]
Status: acquired 483 GiB (519397568992 bytes)
        in 3 hour(s), 33 minute(s) and 10 second(s) with 38 MiB/s (40613857 bytes/second).

Thanks

@joachimmetz
Member

joachimmetz commented Jan 13, 2023

20140815 is the legacy version; I was more interested in the output of the experimental version. I'll see if I can reproduce this scenario when time permits.

@JustMyGithub
Author

JustMyGithub commented Jan 13, 2023

That is correct. As described above, it does not really make a difference in output or speed (besides the version number), and I did not want to spam the thread. While reproducing it, however, I found out that -j 50 falls back to -j 4 -- I had not noticed that before. Anything above 32 jobs falls back to 4, so the effective maximum I tried was -j 12. (Wouldn't it be more useful to fall back to the maximum instead of the default when the given parameter is greater than the maximum?)

In the current test, I observed that legacy has 8-10% CPU utilization and non-legacy seems to have slightly more (11-13%).
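
One way to double-check how many threads are actually running during acquisition (an assumption: powershell.exe is reachable from the Cygwin PATH and the process name matches) might be:

# report the number of threads of the running ewfacquirestream process
powershell -Command "(Get-Process ewfacquirestream).Threads.Count"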

Here is the start of the output for -j 32 with the non-legacy version:

ewfacquirestream 20230101

Using the following acquiry parameters:
Image path and filename:                /cygdrive/k/image.E01
Case number:                            case_number
Description:                            description
Evidence number:                        evidence_number
Examiner name:                          examiner_name
Notes:                                  notes
Media type:                             fixed disk
Is physical:                            yes
EWF file format:                        EnCase 6 (.E01)
Compression method:                     deflate
Compression level:                      best
Acquiry start offset:                   0
Number of bytes to acquire:             0 (until end of input)
Evidence segment file size:             25 GiB (26843545600 bytes)
Bytes per sector:                       512
Block size:                             64 sectors
Error granularity:                      64 sectors
Retries on read error:                  2
Zero sectors on read error:             no

Acquiry started at: Jan 13, 2023 07:45:40
This could take a while.

Status: acquired 70 MiB (74022912 bytes)
        in 4 second(s) with 17 MiB/s (18505728 bytes/second)

Status: acquired 95 MiB (100499456 bytes)
        in 8 second(s) with 11 MiB/s (12562432 bytes/second)

Status: acquired 123 MiB (129859584 bytes)
        in 12 second(s) with 10 MiB/s (10821632 bytes/second)
[...]
Status: acquired 12 GiB (13176012800 bytes)
        in 6 minute(s) and 16 second(s) with 33 MiB/s (35042587 bytes/second)

