
Massive trace file? #24

Open
indigoviolet opened this issue Oct 14, 2020 · 6 comments


@indigoviolet

Hi Kunal,

Thanks for this library.

I'm trying it out to trace some pytorch code after striking out on pytracing and functiontrace.

One thing I noticed was that it created a massive trace file: 5.1 GB, which gzipped to 300 MB. Is that expected? (The other libraries created much smaller files.) chrome://tracing failed to load that file...

If that's unavoidable, would it make sense to have some way of restricting the depth of the trace - for example I would care about user code, imported libraries, python stdlib, python core code in decreasing order - so if I could only trace the first two that might help?

Cheers.

@kunalb
Owner

kunalb commented Oct 15, 2020

Thanks for trying it out! Hmm, is the code shareable, so I can see what you're doing? I would have expected a similar order of magnitude compared to other tracers. The main additional thing it captures is connections between async function calls.

chrome://tracing will start dying around ~1 GB of trace.

I have filtering planned, but no real urgency for it -- I'll move that up and implement it so that you can choose which frames show up in traces.

In the meantime, is it feasible for you to use @Probe instead, to instrument only the functions you explicitly care about, or to use a context manager to isolate what you're digging into?
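(For the context-manager route, here's a stdlib-only sketch of the general idea — record only calls matching a predicate, and only inside the `with` block. `trace_calls` is an illustrative stand-in, not panopticon's actual API.)

```python
import sys
from contextlib import contextmanager

@contextmanager
def trace_calls(predicate, events):
    """Scoped, selective instrumentation: record the names of functions
    called inside this block whose module name satisfies `predicate`.
    Illustrative only -- panopticon's real API differs."""
    def profiler(frame, event, arg):
        if event == "call" and predicate(frame.f_globals.get("__name__", "")):
            events.append(frame.f_code.co_name)
    sys.setprofile(profiler)  # install the profiler for this thread
    try:
        yield
    finally:
        sys.setprofile(None)  # always uninstall, even on error

# Usage: only calls inside the `with` block are recorded.
events = []
def work():
    return 1

with trace_calls(lambda mod: True, events):
    work()
```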

@kunalb
Owner

kunalb commented Oct 15, 2020

(Another thing I've done in the past when I really needed to see a massive trace is to split it into multiple traces by just cutting it at arbitrary points, then loading the intermediate pieces. But this is a very painful workflow.)
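(The cutting can be done with the stdlib alone, assuming the trace is in Chrome's JSON format — either a flat array of events or an object with a "traceEvents" key. `split_trace` and the chunk size here are illustrative, not part of panopticon.)

```python
import json

def split_trace(path, chunk_size=500_000):
    """Cut a Chrome trace into smaller files that chrome://tracing can load.

    Assumes the file is either a flat JSON array of events or an object
    with a "traceEvents" key (both are valid Chrome trace formats).
    Returns the list of part filenames written.
    """
    with open(path) as f:
        data = json.load(f)
    events = data["traceEvents"] if isinstance(data, dict) else data
    parts = []
    for i in range(0, len(events), chunk_size):
        out = f"{path}.part{i // chunk_size}.json"
        with open(out, "w") as f:
            json.dump(events[i:i + chunk_size], f)
        parts.append(out)
    return parts
```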

@indigoviolet
Author

I can generate and share a subset of the trace if you think that'll be helpful - the code itself isn't particularly clever or unusual; it's a modified version of https://detectron2.readthedocs.io/. I'm tracing the model training for ~20 iterations. No threading or asyncio or stuff like that.

Re @Probe, I've had some success using that same approach with https://region-profiler.readthedocs.io/en/latest/, but it's obviously more work to do it upfront.

@kunalb
Owner

kunalb commented Oct 16, 2020

If you pull master you should be able to do something like:

with AsyncioTracer(trace, skip=stdlib()):
    ...

or

with AsyncioTracer(trace, skip=not_(module_equals("yourmodulename"))):
    ...

@kunalb
Owner

kunalb commented Oct 16, 2020

PS. how are you finding all these tracers? I spent quite a bit of time before starting panopticon, and found less than half of the ones you have.

@indigoviolet
Author

indigoviolet commented Oct 16, 2020

Awesome, thank you so much for working on this so quickly. I'll give it a shot soon and let you know.

PS. how are you finding all these tracers?

haha - just googling I guess, "python chrome tracing" etc. In fact I found one of your comments on a functiontrace HN thread that led me here, though I had earlier seen an FB post about it.
