Add asynchronous prefetch for DirectIO directory #15224
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.
private final int prefetchBytesSize;
private final Deque<Long> pendingPrefetches = new ArrayDeque<>();
private final FileChannel channel;
private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
Is the executor something you would want to share within a directory or potentially even across directories? I can't find any documentation that indicates that this pattern would be a problem.
Do you have some numbers on the throughput change?
This is neat -- if Lucene implements enough top-down prefetch hinting, it might eventually be that DirectIO, alone, is sufficient for good query latency/throughput? I.e. we could stop entirely relying on OS to do its prefetching/caching (buffer cache), maybe, in very cold indices? Isn't
Correct, it's only used in certain scenarios. We are experimenting with using it in more areas (e.g. vector rescoring, to keep from polluting the off-heap cache with rescoring vectors).
It's not quite there yet. I have seen this improve throughput by more than 2x though, depending on the read patterns. MMAP still has TONS of advantages (direct memory segment access being a HUGE one for vectors). Virtual threads make this VERY easy, but I am sure there is a lot of headroom for improvements.
I also think that NIOFS could benefit from a prefetch implementation as well.
If you used direct IO for everything, you would want to introduce an explicit disk cache somewhere. Even with prefetching, I don't think performance would meet expectations for a lot of workloads if most reads resulted in a syscall.
100% agreed. I think we are a long ways away from making IO super cheap. Again, MMAP has many benefits still. But virtual threads do make this way easier than it would have been before!
This adds prefetching to directIO. The idea is pretty simple,
When doing many prefetches and handling things in batches, this can significantly improve throughput.
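For readers following the thread, here is a minimal sketch of the general pattern being discussed: issue read hints for a batch of offsets, let an executor fetch the blocks in the background, then consume them. This is not the code from this PR. The class name `PrefetchSketch`, the `blockSize` parameter, and the map of pending futures are all assumptions made for illustration, and the sketch uses a plain `FileChannel` with heap buffers rather than the aligned buffers and `O_DIRECT` handling that real direct IO needs. The executor is passed in, so it could be shared within or across directories, which touches on the review question above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;

/** Hypothetical sketch (not the PR's implementation): async block prefetch over a FileChannel. */
public class PrefetchSketch implements AutoCloseable {
  private final FileChannel channel;
  private final int blockSize; // assumed fixed block granularity
  private final ExecutorService executor; // supplied by the caller, so it can be shared
  // Block-aligned offset -> in-flight or completed background read.
  private final Map<Long, CompletableFuture<ByteBuffer>> pending = new ConcurrentHashMap<>();

  public PrefetchSketch(Path path, int blockSize, ExecutorService executor) throws IOException {
    this.channel = FileChannel.open(path, StandardOpenOption.READ);
    this.blockSize = blockSize;
    this.executor = executor;
  }

  private long align(long offset) {
    return offset - (offset % blockSize);
  }

  /** Hint: start loading the block containing {@code offset}; returns without waiting. */
  public void prefetch(long offset) {
    pending.computeIfAbsent(
        align(offset), s -> CompletableFuture.supplyAsync(() -> readBlock(s), executor));
  }

  /** Returns the block containing {@code offset}, using a prefetched copy when one exists. */
  public ByteBuffer read(long offset) {
    long start = align(offset);
    CompletableFuture<ByteBuffer> inFlight = pending.remove(start);
    return inFlight != null ? inFlight.join() : readBlock(start);
  }

  /** Positional read, so concurrent tasks do not disturb a shared channel position. */
  private ByteBuffer readBlock(long start) {
    try {
      ByteBuffer buf = ByteBuffer.allocate(blockSize);
      long pos = start;
      while (buf.hasRemaining()) {
        int n = channel.read(buf, pos);
        if (n < 0) {
          break; // end of file: return a short (possibly empty) block
        }
        pos += n;
      }
      buf.flip();
      return buf;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

A caller would invoke `prefetch` for every offset in a batch and only then call `read` for each one, so the blocking reads overlap instead of running one after another. On Java 21+, passing `Executors.newVirtualThreadPerTaskExecutor()` (as in the diff above) gives each pending read its own cheap virtual thread; any other `ExecutorService` works with the sketch too.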