Optimize AsyncVideoDecoder for Better Memory + Switch to Bounded Channels (Possible memory leak fix) #122
+213
−214
Pull Request: Optimize AsyncVideoDecoder for Better Memory Management and Performance
Changes Made
Added new constants for cache management.
Introduced a CachedFrame struct for better frame management.
Changed the cache data structure from BTreeMap to VecDeque.
Implemented new cache management methods (cleanup_cache and aggressive_cleanup).
Modified the frame decoding and caching logic.
Added a frame_to_send variable to store the frame to be sent.
Moved sender.send() outside of all loops.
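As a rough sketch of the cache changes listed above, a VecDeque-backed cache with the two cleanup methods could look like the following. The constant names and values, the FrameCache wrapper, the field names, and the oldest-first eviction policy are assumptions made for illustration; they are not taken from the PR's code.

```rust
use std::collections::VecDeque;

// Hypothetical limits; the PR's actual constants are not shown in this description.
const MAX_CACHED_FRAMES: usize = 4;
const AGGRESSIVE_KEEP: usize = 1;

/// A decoded frame together with the frame number it belongs to.
struct CachedFrame {
    frame_number: u32,
    data: Vec<u8>,
}

/// Frame cache kept in decode order, oldest frame at the front.
struct FrameCache {
    frames: VecDeque<CachedFrame>,
}

impl FrameCache {
    fn new() -> Self {
        Self { frames: VecDeque::new() }
    }

    /// Append a frame, then trim the cache back to its limit.
    fn insert(&mut self, frame_number: u32, data: Vec<u8>) {
        self.frames.push_back(CachedFrame { frame_number, data });
        self.cleanup_cache();
    }

    /// Look up a cached frame by number (linear scan over a small cache).
    fn get(&self, frame_number: u32) -> Option<&[u8]> {
        self.frames
            .iter()
            .find(|f| f.frame_number == frame_number)
            .map(|f| f.data.as_slice())
    }

    /// Evict the oldest frames until the cache is within MAX_CACHED_FRAMES.
    fn cleanup_cache(&mut self) {
        while self.frames.len() > MAX_CACHED_FRAMES {
            self.frames.pop_front();
        }
    }

    /// Keep only the newest AGGRESSIVE_KEEP frames and release spare capacity.
    fn aggressive_cleanup(&mut self) {
        while self.frames.len() > AGGRESSIVE_KEEP {
            self.frames.pop_front();
        }
        self.frames.shrink_to_fit();
    }

    fn len(&self) -> usize {
        self.frames.len()
    }
}
```

Because frames are usually decoded in increasing order, evicting from the front of the deque drops the frames least likely to be requested next.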
Explanation
The main issues addressed in this update are related to memory management, performance optimization, and potential undefined behavior due to improper use of the oneshot sender.
Prevention of Memory Leaks and Improved Performance
Better Cache Management: The new CachedFrame struct and VecDeque data structure allow for more efficient frame caching and access patterns. The cleanup_cache and aggressive_cleanup methods ensure that the cache doesn't grow beyond specified limits, preventing potential memory leaks.
Single Use of Sender: By moving the sender.send() call outside of all loops, we ensure that the sender is used exactly once per request. This prevents potential memory leaks that could occur if frames were sent multiple times or if the sender was dropped without being used.
Efficient Frame Caching: The frame_to_send variable allows us to store the required frame as soon as it's decoded, without immediately sending it. This approach prevents unnecessary decoding of frames after the target frame is found, reducing memory usage and improving performance.
Periodic Cache Cleanup: The implementation of periodic cache cleanup helps maintain a reasonable memory footprint over time, preventing the accumulation of unused frames in memory.
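The single-send pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: a std::sync::mpsc Sender taken by value stands in for the oneshot sender, the Frame type and the handle_request function are hypothetical, and the decoder is modelled as a plain iterator of numbered frames.

```rust
use std::sync::mpsc::{channel, Sender};

/// Stand-in for a decoded frame.
type Frame = Vec<u8>;

/// Decode frames in order until `target` is found, then reply exactly once.
/// Returns how many frames were decoded before the loop stopped.
fn handle_request(
    decoded: impl Iterator<Item = (u32, Frame)>,
    target: u32,
    sender: Sender<Option<Frame>>, // taken by value, so it cannot be reused
) -> usize {
    let mut frame_to_send: Option<Frame> = None;
    let mut decoded_count = 0;

    for (number, frame) in decoded {
        decoded_count += 1;
        if number == target {
            // Remember the frame and stop decoding immediately.
            frame_to_send = Some(frame);
            break;
        }
    }

    // One reply per request, outside the loop. A send error only means the
    // requester has gone away, so it is ignored here.
    let _ = sender.send(frame_to_send);
    decoded_count
}
```

Every code path reaches the one send call, so the requester always gets an answer: Some(frame) when the target was found, None when the stream ended first.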
Problems Addressed
Memory Leaks: The previous implementation risked accumulating too many frames in memory without a proper cleanup mechanism. The new cache management system addresses this.
Inefficient Caching: The old BTreeMap implementation was less efficient for the access patterns typically seen in video decoding. The new VecDeque structure is more appropriate for this use case.
Potential Undefined Behavior: The previous code had multiple points where the oneshot sender could be used, risking using it more than once or not at all in some code paths. The new implementation ensures consistent, single use of the sender.
Performance Issues: The old code continued decoding frames even after finding the requested frame. The new implementation breaks the decoding loop as soon as the required frame is found, improving performance.
Conclusion
These changes significantly improve the robustness, efficiency, and safety of the AsyncVideoDecoder. By implementing better cache management, ensuring proper use of the oneshot sender, and optimizing the frame decoding process, we've mitigated the risk of memory leaks, improved overall performance, and enhanced the reliability of the video decoding system.
Change Number 2:
Transition to Bounded Channels and Performance Enhancements
Changes
In desktop/src-tauri/src/lib.rs:
Added explicit import for mpsc.
Replaced unbounded channel with bounded channel.
In rendering/lib.rs:
Updated imports.
Changed render_video_to_channel function signature.
Implemented parallel frame processing using stream::iter and buffer_unordered.
Added timeout for frame production.
Rationale
Memory Management: Transitioning from unbounded to bounded channels helps prevent potential memory leaks by providing backpressure.
Parallel Processing: Utilizing buffer_unordered with num_cpus::get() allows for efficient parallel processing of frames, potentially improving render speed on multi-core systems.
Timeout Mechanism: Adding a timeout for frame production helps prevent the renderer from hanging indefinitely on problematic frames.
Performance Optimization: The buffer size of 180 frames (3 seconds at 60 fps) provides a good balance between smooth performance and efficient resource usage for most modern systems.
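The combination of CPU-sized parallelism, out-of-order completion, and a per-frame timeout can be sketched with the standard library alone. Note the substitution: this uses worker threads and recv_timeout rather than stream::iter with buffer_unordered, so it shows the idea and not the PR's implementation. render_frame and render_all are hypothetical names, and the "rendering" is a placeholder computation.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical per-frame work: here it just doubles the frame number.
fn render_frame(frame_number: u32) -> u32 {
    frame_number * 2
}

/// Render `frame_count` frames on one worker thread per available core and
/// collect results in completion order. Fails if no frame arrives in `timeout`.
fn render_all(
    frame_count: u32,
    timeout: Duration,
) -> Result<Vec<(u32, u32)>, RecvTimeoutError> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let next = Arc::new(Mutex::new(0u32)); // shared counter handing out frame numbers
    let (tx, rx) = channel();

    for _ in 0..workers {
        let next = Arc::clone(&next);
        let tx = tx.clone();
        thread::spawn(move || loop {
            // Claim the next frame number, or stop when none are left.
            let n = {
                let mut guard = next.lock().unwrap();
                if *guard >= frame_count {
                    break;
                }
                let n = *guard;
                *guard += 1;
                n
            };
            if tx.send((n, render_frame(n))).is_err() {
                break; // the consumer gave up
            }
        });
    }
    drop(tx); // only the workers hold senders now

    let mut results = Vec::with_capacity(frame_count as usize);
    loop {
        match rx.recv_timeout(timeout) {
            Ok(item) => results.push(item),
            // All workers finished and the queue is drained.
            Err(RecvTimeoutError::Disconnected) => return Ok(results),
            // No frame was produced within the timeout.
            Err(err) => return Err(err),
        }
    }
}
```

Results arrive in whatever order the workers finish, so a caller that needs frames in sequence has to sort or reorder them by frame number.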
Potential Impact
Testing Recommendations
Future Considerations