Row count estimation on DATETIME type is over-estimated when the time span crosses years/months #50080
Comments
/found gs
It seems the histogram estimates the range size from values converted to bytes/string type, without taking the time type into account. Even when no out-of-range condition is involved, over-estimated row counts still appear whenever the queried time span crosses a year, month, or day boundary.
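To make the effect concrete, here is a minimal Python sketch. It assumes a MySQL-style packed DATETIME layout in which `year * 13 + month` occupies the high bits; the issue does not spell out the exact encoding, so `pack_datetime` and `packed_width` are hypothetical helpers for illustration, not TiDB functions. The point is only that distances measured on an encoded value are not proportional to elapsed time.

```python
from datetime import datetime


def pack_datetime(dt):
    """Pack a datetime into one integer (assumed MySQL-style layout)."""
    ym = dt.year * 13 + dt.month          # 13 slots per year, month 0 unused
    ymd = (ym << 5) | dt.day              # 32 slots per month
    hms = (dt.hour << 12) | (dt.minute << 6) | dt.second
    return (((ymd << 17) | hms) << 24) | dt.microsecond


def packed_width(lo, hi):
    """Range width as seen by an estimator that ignores the time type."""
    return pack_datetime(hi) - pack_datetime(lo)


# Three ranges that are each exactly one day long in real time.
same_month = packed_width(datetime(2023, 6, 1), datetime(2023, 6, 2))
cross_month = packed_width(datetime(2023, 6, 30), datetime(2023, 7, 1))
cross_year = packed_width(datetime(2023, 12, 31), datetime(2024, 1, 1))

# Width relative to the same-month day: the boundary-crossing ranges
# look several times wider although the elapsed time is identical.
print(cross_month // same_month)  # 3
print(cross_year // same_month)   # 34
```

Under this assumed layout, a one-day range that crosses a year boundary looks 34 times wider than a one-day range inside a month, so any estimate proportional to that width is inflated accordingly.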
The problem is still present on master (c44e991).
To solve the issue completely, an analyze version
Also, the problem is not limited to out-of-range estimation. In-bucket estimation suffers from it too, but the over-estimation there is smaller than in the out-of-range case, because the estimate is capped at the row count of that bucket.
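The capped in-bucket case can be sketched the same way. The code below assumes plain linear interpolation inside a bucket and the same MySQL-style packed layout as a stand-in for the bytes conversion; `in_bucket_estimate`, `pack_datetime`, and `seconds` are hypothetical names, and TiDB's real estimator may differ in detail.

```python
from datetime import datetime


def pack_datetime(dt):
    """Pack a datetime into one integer (assumed MySQL-style layout)."""
    ymd = ((dt.year * 13 + dt.month) << 5) | dt.day
    hms = (dt.hour << 12) | (dt.minute << 6) | dt.second
    return (((ymd << 17) | hms) << 24) | dt.microsecond


def seconds(dt):
    """Time-aware key: seconds since 1970-01-01 for a naive datetime."""
    return (dt - datetime(1970, 1, 1)).total_seconds()


def in_bucket_estimate(lower, upper, value, bucket_rows, key):
    """Estimate rows in [lower, value) by linear interpolation on key()."""
    span = key(upper) - key(lower)
    if span <= 0:
        return float(bucket_rows)
    fraction = (key(value) - key(lower)) / span
    fraction = min(max(fraction, 0.0), 1.0)  # never exceeds the bucket
    return fraction * bucket_rows


# Hypothetical bucket of 1000 rows covering 2023-12-01 .. 2024-01-31.
lower, upper = datetime(2023, 12, 1), datetime(2024, 1, 31)
value = datetime(2024, 1, 1)

by_time = in_bucket_estimate(lower, upper, value, 1000, seconds)
by_packed = in_bucket_estimate(lower, upper, value, 1000, pack_datetime)
print(round(by_time), round(by_packed))  # 508 681
```

With these made-up numbers the packed key yields about 681 rows against about 508 for the time-aware key: an over-estimate, but one that cannot exceed the 1000 rows in the bucket.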
Another issue is that, because of this problem, the gap between the estimates for a non-indexed datetime column and an indexed datetime column is too large. The current solution is to reduce this part of the error first.
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
test.zip
2. What did you expect to see? (Required)
3. What did you see instead (Required)
4. What is your TiDB version? (Required)
v6.5.3