vTune Profile #140

Open
ahorek opened this issue Aug 11, 2020 · 2 comments

Comments

@ahorek
Contributor

ahorek commented Aug 11, 2020

Here's a profile captured with Intel VTune Profiler: https://software.intel.com/content/www/us/en/develop/tools/vtune-profiler.html

Measured on an Intel platform (AVX2 build, GCC 9.2.0).

Hopefully it will be useful for anyone investigating where the bottlenecks are.

paq8px.exe -8 data.dat // default CM
overview
[image]
paq8px.exe -8l data.dat // default CM
overview (LSTM)
[image]
paq8px.exe -8 data.dat // default CM
tree
[image]
paq8px.exe -8 data.dat // default CM
tree (LSTM)
[image]

@moisespr123
Contributor

Hi,

Please suggest this at the official paq8px thread here: https://encode.su/threads/342-paq8px/page70

@micsthepick

I looked at the callgrind graphs for some simple files, and a lot of time is spent initializing the APM classes. When I added some custom file-caching code to avoid recalculating each set of coefficients for the given APM sizes, it sped up significantly.
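
To illustrate the idea of not recomputing APM coefficients, here is a minimal sketch. It is not the code described in the comment above: that code cached to a file, while this sketch memoizes in memory, keyed by table size. The names `squashSketch`, `buildTable`, `cachedTable` and `g_tableBuilds` are hypothetical, and the table layout (33 interpolation buckets per context, each seeded from a logistic function) is an assumption modeled on paq-style APMs rather than taken from the paq8px source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a paq-style logistic "squash":
// maps a stretched value d to a 12-bit probability in [1, 4095].
static int squashSketch(int d) {
    if (d > 2047) return 4095;
    if (d < -2047) return 1;
    int r = static_cast<int>(4096.0 / (1.0 + std::exp(-d / 256.0)));
    return r < 1 ? 1 : (r > 4095 ? 4095 : r);
}

// Counts how many times a table was actually computed (for illustration).
static int g_tableBuilds = 0;

// Assumed layout: `contexts` rows of 33 buckets, every row starting out identical.
static std::vector<std::uint16_t> buildTable(std::size_t contexts) {
    ++g_tableBuilds;
    std::uint16_t row[33];
    for (int j = 0; j < 33; ++j) {
        // The expensive exp() calls run only 33 times, not 33 * contexts times.
        row[j] = static_cast<std::uint16_t>(squashSketch((j - 16) * 128) * 16);
    }
    std::vector<std::uint16_t> table(contexts * 33);
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = row[i % 33];
    }
    return table;
}

// Returns the initial table for a given size, computing it at most once per size.
static const std::vector<std::uint16_t>& cachedTable(std::size_t contexts) {
    assert(contexts > 0);
    static std::map<std::size_t, std::vector<std::uint16_t>> cache;
    auto it = cache.find(contexts);
    if (it == cache.end()) {
        it = cache.emplace(contexts, buildTable(contexts)).first;
    }
    return it->second;
}
```

Each APM instance would still need its own mutable copy of the table, for example `std::vector<std::uint16_t> t = cachedTable(n);`, so the saving is in the coefficient computation, not in the memory copy. The cache is not thread-safe as written.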
