refactor: reuse the slice to optimize #5889
@krasi-georgiev Also optimize
After reading the code more comprehensively, I think the slice of Because If we reuse the slice in deleting, the Postings may return duplicate series ref.
tsdb/index/postings.go
break
}
}
if !found {
Why did you remove the short circuit? The short circuit doesn't allocate anymore, but still avoids extra append calls and assignment.
@csmarchbanks Actually, the short circuit still needs to iterate over p.m[n][l] once, and if found == true, it needs to allocate and iterate a second time to fill repl. But if we reuse p.m[n][l], only one loop is needed, with no allocation under any conditions.
However, as I commented above, reusing the slice would introduce a sneaky bug, so I'll roll back this code change.
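For reference, the two strategies being compared can be sketched roughly as follows. This is a simplified illustration, not the actual code in tsdb/index/postings.go: the function names `deleteWithCopy` and `deleteInPlace`, the `[]uint64` ref type, and the `map[uint64]struct{}` set of deleted refs are all assumptions made for the example.

```go
package main

import "fmt"

// deleteWithCopy sketches the short-circuit approach: scan first, and only
// allocate a replacement slice when at least one ref has to be removed.
// (Hypothetical helper, not the real MemPostings.Delete.)
func deleteWithCopy(list []uint64, deleted map[uint64]struct{}) []uint64 {
	found := false
	for _, id := range list {
		if _, ok := deleted[id]; ok {
			found = true
			break
		}
	}
	if !found {
		return list // nothing to delete, so no allocation and no appends
	}
	repl := make([]uint64, 0, len(list))
	for _, id := range list {
		if _, ok := deleted[id]; !ok {
			repl = append(repl, id)
		}
	}
	return repl
}

// deleteInPlace sketches the slice-reuse approach: a single loop and no
// allocation, but the backing array of list is overwritten.
func deleteInPlace(list []uint64, deleted map[uint64]struct{}) []uint64 {
	repl := list[:0]
	for _, id := range list {
		if _, ok := deleted[id]; !ok {
			repl = append(repl, id)
		}
	}
	return repl
}

func main() {
	deleted := map[uint64]struct{}{2: {}, 4: {}}
	fmt.Println(deleteWithCopy([]uint64{1, 2, 3, 4, 5}, deleted)) // [1 3 5]
	fmt.Println(deleteInPlace([]uint64{1, 2, 3, 4, 5}, deleted))  // [1 3 5]
}
```

Both variants return the same result; they differ only in how many passes they make and in whether the original memory is left untouched.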
This code is inside of a Lock, so reusing the slice should not cause a problem.
I will rephrase my concern: the previous code did a loop with only gets in order to quickly see if anything changed. The new code only does one loop, but that loop is more expensive since it also does an append for every element that was not deleted. I am wondering if you have any benchmarks to show that doing the extra operations is OK.
The lock only protects MemPostings.m, not the memory underlying it.
For example, if MemPostings.m["foo"]["bar"] is {1, 2, 3, 4, 5}, then MemPostings.Get("foo", "bar") returns a ListPostings whose list field shares the same memory as MemPostings.m["foo"]["bar"].
Then MemPostings.Delete() deletes some refs, 2 and 4 for example. If we reuse the slice, the underlying memory becomes {1, 3, 5, 4, 5}. As a result, iterating over the ListPostings will no longer be correct.
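The scenario above can be reproduced in isolation with a small sketch. The `listPostings` type and the `filterInPlace` and `demo` helpers here are hypothetical stand-ins written for this example, not the real tsdb types; the only assumption carried over is that the reader keeps a reference to the same backing array as the stored slice.

```go
package main

import "fmt"

// listPostings is a hypothetical stand-in for an iterator that keeps a
// reference to the slice it was created from.
type listPostings struct {
	list []uint64
}

// filterInPlace drops deleted refs while reusing the backing array of list.
func filterInPlace(list []uint64, deleted map[uint64]struct{}) []uint64 {
	repl := list[:0]
	for _, id := range list {
		if _, ok := deleted[id]; !ok {
			repl = append(repl, id)
		}
	}
	return repl
}

// demo returns the stored slice after deletion and the view that a reader
// obtained before the deletion.
func demo() (stored, seen []uint64) {
	stored = []uint64{1, 2, 3, 4, 5}
	reader := listPostings{list: stored} // shares memory with stored

	stored = filterInPlace(stored, map[uint64]struct{}{2: {}, 4: {}})
	return stored, reader.list
}

func main() {
	stored, seen := demo()
	fmt.Println(stored) // [1 3 5]
	fmt.Println(seen)   // [1 3 5 4 5]
}
```

The reader still sees five entries, including the deleted ref 4 and a duplicate 5, even though every write happened while the writer could have been holding the lock.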
👍 That makes sense. Also, that is an insidious potential bug that does not fail any tests.
So I'll remove this part of the slice reuse.
And I'd like to add a test for this insidious case if necessary :)
> And I'd like to add a test for this insidious case if necessary :)
Please do, or else I will in a separate PR. :) It doesn't look like MemPostings.Delete has any package-level testing right now.
Another PR will be fine.
Signed-off-by: YaoZengzeng <[email protected]>
@krasi-georgiev @bwplotka @csmarchbanks It's proved that reuse the slice for But PTAL.
@YaoZengzeng thanks,
@krasi-georgiev Maybe the referenced functions are not called frequently, so the optimization is not obvious. But reusing the slice that way is appropriate and more efficient in theory, right? :)
Yep, 100% correct, but sometimes the theory is not what we see in practice. It should be simple enough to write a benchmark that shows the difference, and it would be useful for future tests.
Yes, I'd like to research it in more depth :)
Thanks, appreciated. Waiting for the results.
@YaoZengzeng any progress on this?
ping @YaoZengzeng
@krasi-georgiev Sorry for the delay, I've been really busy these days. 😂 I'll pick this up once I have time :)
@YaoZengzeng are you still busy? Any chance you can look into this anytime soon?
In the bug scrub, we decided to close this for lack of activity. If somebody wants to pick this up, we'll appreciate it. The next step, as discussed above, would be to verify the improvement with a benchmark.
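For anyone picking this up, a benchmark along the following lines might be a starting point. It is only a sketch under stated assumptions: it compares two simplified filter functions rather than the real MemPostings code, and the names `filterCopy`, `filterInPlace`, `benchFilter`, and `makeInput`, the list size, and the one-in-ten deletion ratio are all made up for illustration. It uses `testing.Benchmark` so that it runs as a plain program; in the repository it would more naturally live in a `_test.go` file.

```go
package main

import (
	"fmt"
	"testing"
)

const size = 10000 // hypothetical list length

// makeInput builds a list of refs and marks every tenth one as deleted.
func makeInput() ([]uint64, map[uint64]struct{}) {
	list := make([]uint64, size)
	deleted := make(map[uint64]struct{})
	for i := range list {
		list[i] = uint64(i)
		if i%10 == 0 {
			deleted[uint64(i)] = struct{}{}
		}
	}
	return list, deleted
}

// filterCopy scans first and allocates a new slice only if needed.
func filterCopy(list []uint64, deleted map[uint64]struct{}) []uint64 {
	found := false
	for _, id := range list {
		if _, ok := deleted[id]; ok {
			found = true
			break
		}
	}
	if !found {
		return list
	}
	repl := make([]uint64, 0, len(list))
	for _, id := range list {
		if _, ok := deleted[id]; !ok {
			repl = append(repl, id)
		}
	}
	return repl
}

// filterInPlace reuses the backing array of list.
func filterInPlace(list []uint64, deleted map[uint64]struct{}) []uint64 {
	repl := list[:0]
	for _, id := range list {
		if _, ok := deleted[id]; !ok {
			repl = append(repl, id)
		}
	}
	return repl
}

// benchFilter wraps a filter function in a benchmark body.
func benchFilter(filter func([]uint64, map[uint64]struct{}) []uint64) func(*testing.B) {
	return func(b *testing.B) {
		src, deleted := makeInput()
		scratch := make([]uint64, len(src))
		b.ReportAllocs()
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			// Restore the input on every iteration, since the in-place
			// variant mutates it. Both variants pay the same copy cost.
			copy(scratch, src)
			_ = filter(scratch, deleted)
		}
	}
}

func main() {
	c := testing.Benchmark(benchFilter(filterCopy))
	p := testing.Benchmark(benchFilter(filterInPlace))
	fmt.Println("copy:    ", c.String(), c.MemString())
	fmt.Println("in place:", p.String(), p.MemString())
}
```

Varying the fraction of deleted refs, including the case where nothing is deleted, would show how much the short circuit is actually worth in practice.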
ref: prometheus-junkyard/tsdb#676
@bwplotka @krasi-georgiev
Signed-off-by: YaoZengzeng [email protected]