memory leak ($300 bounty) #278
Comments
💎 $300 bounty • Screenpi.pe
Steps to solve:
Thank you for contributing to mediar-ai/screenpipe! |
Hi, I'd be interested in giving this a shot if you could give me instructions on how exactly to run this to trigger the problem. |
I am (mostly) free ATM, so I would not mind fixing this leak too. @louis030195 |
@FractalFir last time? This is the current process, but I think the leaks command or UI is not helpful anymore: it only shows a 7 MB leak, which is why I'm looking for another way to profile. Are you on Linux? I heard about https://github.com/flamegraph-rs/flamegraph but it does not work on Mac. To build the CLI on Linux:
screencapture does not work on Wayland fyi |
you can profile heap memory usage with jemalloc_pprof on Linux. Apply this diff:
install pprof (assuming you have golang installed):
get a heap pprof and analyze it with the pprof tool:
(I suggest navigating to "flamegraph" in the pprof UI) |
I meant like when I worked on fixing the leak in screencapturekit-rs. Yes, I am on Linux, currently downloading and building the CLI. I have used things like cargo-flamegraph quite a bit before, because I was dealing with high memory usage in my own projects. Well, the output still tells us a few things. None of the leaks are > 64 bytes, which suggests that the leaked object is small. It is unlikely to be a video frame / audio sample. |
what is "screencapture" ? And does this mean it is impossible to reproduce the leak on wayland? |
screenpipe takes screenshots of all your windows on all your monitors continuously and does OCR + mp4 encoding to disk. it also records audio continuously and does STT + mp4 encoding. and it means you cannot reproduce the vision leaks on wayland; you can reproduce audio leaks though (might break down the bounty into smaller ones if there is a leak in both audio and vision). you can disable audio or vision using |
I'm repeatedly getting this error when running (in X):
|
hmm this is another issue that nobody found how to reproduce actually #228 |
I think I can replicate the leak on my machine, and it seems a bit bigger on Linux.
300 MB in 30 seconds is quite a lot. I will be analysing the exact cause. |
keep in mind we load a whisper-large model in memory at boot (nvidia if using cuda feature and apple stuff when using metal feature or RAM+CPU) for audio transcription screenpipe/screenpipe-server/src/core.rs Line 51 in e5dcde7
also leaks show big leak at boot only every time but i cannot see the full stack for some reason in the UI and does not show in the CLI: huggingface/candle#2271 (comment) seems correlated to model loading but not sure |
Yeah, I will let it run for a bit longer to have more accurate data. I thought 1 minute would be enough to initialize everything, but giving it more time will not hurt. |
I will let it run for some more time to get a better picture of what is happening. |
This is quite a weird issue. I have run the executable under heaptrack to see the exact cause of the leak. I think the memory is leaking, but Heaptrack thinks that the peak memory usage was 3.7 GB (or 4.6 GB including heaptrack overhead). However, this does not match the memory usage metrics, which claim a higher usage. Runtime: 491s, Total Memory: 30% (4.71 GB / 15.72 GB), Total CPU: 787% So, it seems like |
I have run the program under valgrind for some time, and have some initial results.
The directly and indirectly lost blocks are pieces of memory valgrind knows can't be freed. However, those were allocated in the C code of some Linux audio utilities, and should not be the cause of the problem. The 12.6 MB of "possibly lost" memory kind of looks like it could be the leak, but I am not sure. The thing about "possibly lost" blocks is that they could still be reachable, so false positives are not out of the question. Some things seem to suggest that at least some of the leaks you have observed are included in that "possibly lost" memory. You have said that you think you might have a leak related to model loading. This to me looks like it could be that leak:
(the bytes in the log are the total count, not the count in that leak). However, once again, those are possible leaks, and not "guaranteed leaks". Also, I am not sure if the leak seen on macOS is also present on Linux. Could you try running the program under valgrind yourself? I just want to make sure the issue is present on both platforms. EDIT: it looks like valgrind is not supported on ARM Macs. :(. I guess we will need to use something different. |
Just to spare someone else the effort: |
interesting actually our resource monitor never properly recorded memory also https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-server/src/resource_monitor.rs |
i need to go away for max 1h, will come back on this issue after (~2 pm here) |
also you can run these other binaries to run smaller parts btw:
|
Understandable, I too will have to go away in a few hours. I have found a way to make the leak faster, and more visible.
With those settings, the memory usage seems to grow from 0.67 GB at the start to 2GB after ~5.5 minutes.
This seems to suggest that this issue is related to video recording. However, this could also be a false positive, since video recording is resource intensive in general. |
easy way to reproduce:
|
This does not seem to work on my machine, when I run
I get:
When I run
So, I am unable to run the vision code standalone. Are you on some specific branch? |
git pull (just updated) |
I think there might be more than one issue, or a more fundamental one, here. This still grows. |
fixed |
Also at least on linux/ubuntu with pipewire as an alsa backend, each time we list the devices, it seems to leak a bit inside pipewire. Both of these seem outside our control. |
I'll look at this more tomorrow, it's midnight here. Good luck 👋 |
It is also midnight for me, so I too will soon be heading to bed. BTW: I can't reproduce the vision leak on Linux. The memory grows for some time, but then it stabilizes. Question: could you provide the output of the leaks command for just the vision module?
I don't know if the |
i reached 12 gb after running will share leaks, do you think it could be apple native OCR? also i mostly heard of memory issues from mac users and less on windows and linux, but still some windows users found using too much memory/cpu sometimes (esp when only 16 gb ram computers or no GPU) this is the code: https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/src/apple.rs https://github.com/mediar-ai/screenpipe/blob/main/screenpipe-vision/src/ocr.swift leaks: https://gist.github.com/louis030195/92717aaedfde57e592bb424567aeeeb6 (note that i did a few changes right now in core.rs vision code before running leaks command which could have improved perf, just see overuse of arc and clones in the vision code that is not necessary) |
The
Well, there is a way to check if this is caused by Apple native OCR. If the leak is present with a different OCR engine, then the issue must be somewhere else. If it disappears after changing engines, then this must be related to Apple OCR. |
running screenpipe-vision with tesseract with 120 fps right now to see |
(does not seem to be the issue) |
Does just calling this function in a loop leak memory? If so, then we know that the issue is in that function and that function alone. If the issue is there, can you replicate it in swift? For example, by just passing some hardcoded image? |
OK, so it looks like the leak is there. There must be some kind of bug in the swift code, so the next logical step would be looking closer at that code to find the exact cause. I am not a swift expert, but I would suggest disabling certain parts of the swift code until the leak disappears. For example, you could check if this swift code alone:
leaks memory. If this first part leaks memory, then the issue is likely there. If calling this stub does nothing, then we know that the leak is somewhere further down the line. You can repeat this process until you find the exact cause of the leak. Sadly, I have to go now. I will take a closer look at this tomorrow. |
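For anyone repeating the "call it in a loop and watch memory" check from the Rust side, a minimal sketch could look like the one below. It assumes Linux, where /proc/self/status exposes a VmRSS line, so it returns nothing useful on macOS. `suspect_call` is a hypothetical stand-in for whatever function is under suspicion, not screenpipe's actual OCR entry point, and the helper names are made up for this example.

```rust
use std::fs;

/// Extract the resident set size (in KiB) from the text of /proc/<pid>/status.
fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        .find(|line| line.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

/// Current RSS of this process in KiB; None where /proc is unavailable (e.g. macOS).
fn current_rss_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kib(&status)
}

/// Call `f` `iterations` times and return (rss_before, rss_after) in KiB.
fn rss_around_loop<F: FnMut()>(iterations: usize, mut f: F) -> Option<(u64, u64)> {
    let before = current_rss_kib()?;
    for _ in 0..iterations {
        f();
    }
    let after = current_rss_kib()?;
    Some((before, after))
}

/// Hypothetical stand-in for the function under suspicion (e.g. one OCR call).
/// It allocates a buffer and drops it, so it should not leak.
fn suspect_call() {
    let buffer = vec![0u8; 64 * 1024];
    std::hint::black_box(&buffer);
}

fn main() {
    match rss_around_loop(10_000, suspect_call) {
        Some((before, after)) => println!("RSS before: {before} KiB, after: {after} KiB"),
        None => println!("RSS not available on this platform"),
    }
}
```

If RSS keeps climbing roughly in proportion to the iteration count, the function itself is the likely culprit. A one-time jump that then stabilizes points to initialization or allocator caching rather than a leak.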
hey everyone, i fixed the leak, doing few more test and will distribute the bounty shortly |
/tip $150 @FractalFir thanks a lot 🙏 feel free to have a look at other issues we do a bunch of bounties, also we did not have the opportunity to test much on linux unfortunately (still trying to setup a cloud desktop with audio and vision available) |
🎉🎈 @FractalFir has been awarded $150! 🎈🎊 |
@exi: You just got a $50 tip! 👉 Complete your Algora onboarding to collect your payment. |
🎉🎈 @exi has been awarded $50! 🎈🎊 |
how does screenpipe work?
previously noticed memory leaks in dependencies:
what is still to fix:
what i tried/did:
what could be helpful to try:
what i suspect is still leaking:
circular references with arc: overuse of arc without proper weak references can create reference cycles, preventing memory from being freed.
unbounded channels: using unbounded channels (e.g., mpsc::unbounded_channel()) without proper backpressure can lead to memory growth if producers outpace consumers.
long-running loops: continuous capture loops in vision and audio processing might accumulate data over time if not properly managed.
unmanaged file handles: repeatedly opening file handles for logging or data storage without proper closure could leak file descriptors.
spawned tasks not being cleaned up: tokio tasks that are spawned but not properly awaited or cancelled could lead to resource leaks.
large data structures in long-running processes: storing large amounts of data in memory for extended periods without proper cleanup.
improper error handling: failing to properly handle errors in async contexts might leave resources uncleaned.
caching without limits: implementing caches without size limits or eviction policies could lead to unbounded growth.
improper use of 'static lifetimes: overuse of 'static lifetimes might prevent data from being dropped when it's no longer needed.
resource-intensive callbacks: callbacks for audio or video processing that allocate memory without proper deallocation.
improper management of external resources: not properly releasing resources from external libraries or apis (e.g., ffmpeg, ocr engines).
accumulating historical data: storing historical data (e.g., previous images for comparison) without a retention policy.
inefficient string handling: repeated string allocations and concatenations in logging or data processing without reuse.
improper shutdown procedures: not properly shutting down all components and releasing resources when the application terminates.
memory fragmentation: frequent allocations and deallocations of varying sizes could lead to memory fragmentation, appearing as a "leak".
improper use of lazy_static or similar patterns: global state that grows over time without bounds.
inefficient use of buffers: repeatedly allocating new buffers for audio or video data instead of reusing existing ones.
improper handling of large files: loading large files entirely into memory instead of streaming or chunking.
unclosed streams: not properly closing audio or video streams, especially when dealing with multiple devices.
improper handling of device disconnections: not cleaning up resources when audio or video devices are disconnected unexpectedly.
wrong usage of ffmpeg; maybe "switch to ffmpeg-sidecar" #194 would help
wrong usage of sqlite db
maybe using IPC for ffmpeg would help (see "[stability/perf] using IPC to communicate with ffmpeg" #246)
something else
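To make the first suspect in the list above (circular references with arc) concrete, here is a minimal sketch of a parent/child pair where the back-reference is a Weak instead of an Arc. The `Parent` and `Child` types and `build_tree` are hypothetical and exist only for illustration; they are not taken from the screenpipe code base.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical owner type: holds strong references to its children.
struct Parent {
    children: Mutex<Vec<Arc<Child>>>,
}

/// Hypothetical child type: the back-reference is Weak, so it does not
/// keep the parent alive and no reference cycle is formed.
struct Child {
    parent: Weak<Parent>,
}

/// Build a parent with one child and return the parent plus a weak observer.
fn build_tree() -> (Arc<Parent>, Weak<Parent>) {
    let parent = Arc::new(Parent { children: Mutex::new(Vec::new()) });
    let child = Arc::new(Child { parent: Arc::downgrade(&parent) });
    parent.children.lock().unwrap().push(child);
    let observer = Arc::downgrade(&parent);
    (parent, observer)
}

fn main() {
    let (parent, observer) = build_tree();

    // The child can still reach its parent while the parent is alive.
    let first_child = Arc::clone(&parent.children.lock().unwrap()[0]);
    println!("child sees parent: {}", first_child.parent.upgrade().is_some());

    // Only the `parent` binding owns the parent strongly.
    println!(
        "strong = {}, weak = {}",
        Arc::strong_count(&parent),
        Arc::weak_count(&parent)
    );

    // Dropping the last strong reference frees the parent, so upgrade() fails.
    drop(parent);
    println!("parent freed: {}", observer.upgrade().is_none());
}
```

If `Child::parent` were an `Arc<Parent>` instead, the parent and child would keep each other's strong counts above zero and neither would ever be dropped. That is the kind of leak this list item describes.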
context:
how to reproduce:
definition of done:
cc:
bounty $300
/bounty 300
happy to jump on a call if useful or for efficiency