-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
start refactoring process by setting up base + init #14306
base: main
Are you sure you want to change the base?
Conversation
cc @Rachelint |
It seems the tests will be executed twice, how about we just left the one complete test file? Because we will only move testcases incrementlly after this pr, seems we can ensure no cases are lost by this way:
And it seems great if we can make this an automatic process? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we name it old_aggregate.slt
or old_testcases.slt
.
And we can add a README to explain the background like what in string
.
(I can help, and not a required thing about merging)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the tests will be executed twice, how about we just left the one complete test file?
Because we will only move testcases incrementlly after this pr, seems we can ensure no cases are lost by this way:
- move cases from
complete_aggregate.slt
tofunction1.slt
- get
diff
between current movedcomplete_aggregate.slt
- compare
diff
andfunction1.slt
And it seems great if we can make this an automatic process?
If I understood you correctly,
My fear is that we might lose track of what we have already moved,
we might not be able to make sure that sum of all funtions_*.slt
is equal to old_aggregate.slt
or not, but for base_aggregate.slt
we know if something is present in it, it is not present in any of the functions file.
beside I can not think of test running twice as a bad thing, it is like an extra layer of security at the cost of ~5 sec of ci time(even on 1 thread).
I can maintain this extra file on my local system but that is like binding this issue to me, it will be easier for anyone to contribute in spliting this file if we keep both.
I definitely agree with making a README file, I was considering it myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with it is most improtant to keep no tests are lost and I think it ok to execute tests twice temporarily.
But I think it a bit strange if we always need to execute tests in later.
If we choose to keep and run all exists cases, it seems good that?
- Just don't modify the old cases in
aggregate.slt
, but only rename it toold_aggregate.slt
- And we ensure no new cases will be added into
old_aggregate.slt
anymore, and guide contributors to add cases in the new way(create a file for the function, and add cases into it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 The alternative may can be following? Maybe actually make sense that we should keep a complete
for ensuring no cases lost.
-
Keep the
complete_aggregate.slt
but just make it won't be executed -
Perform extract and subtract for the
base_aggregate.slt
.
For example we extractmin/max
from it, and subtract them frombase_aggregate.slt
.
And getmin_max.slt
andbase_aggregate.slt
-
Implement a simple program/script to check if
min_max.slt
+base_aggregate.slt
=complete_aggregate.slt
It seems not only aggregate
and string
but also some other test files are too large, may be it can reused during sorting out them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Keep the
complete_aggregate.slt
but just make it won't be executed- Perform extract and subtract for the
base_aggregate.slt
.
This approach looks good to me.
- Implement a simple program/script to check if
min_max.slt
+base_aggregate.slt
=complete_aggregate.slt
This looks fun to me, I will be working on this though it might take some time so, can we merge this portion in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is nice to do it in follow on prs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I open a new issue for this or just a PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I open a new issue for this or just a PR?
I think both of them are ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe open a sub issue of #13723 ?
And we state what we want to do in later refactor in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @logan-keede again, it looks good to me as a start of refactoring!
fix: typo Co-authored-by: kamille <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry... I think of some situations:
- Contributors don't notice the README, and add new tests into
base_aggregate.slt
- Reviewers don't notice the README too, approve and merge the pr
- Finally, the
base_aggregate.slt
become different with the archivedcomplete_aggregate.slt
And it may be painful to solve such conflicts if it happen frequently.
Maybe we should include the check in this pr before merging. And when found new cases added into base_aggregate.slt
, we throw an error and block it in ci.
Sorry again...
…n-keede/datafusion into diff_for_sqllogictests
No problem, I have added the diff function. |
@Rachelint I have added the test to CI, |
Thanks @logan-keede , I will review it in next few days. |
Which issue does this PR close?
part of #13723
Rationale for this change
refer to #14301
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?