Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dead GUC enable_geqo in CBDB, but not in Postgres #657

Open
1 of 2 tasks
avamingli opened this issue Oct 9, 2024 · 12 comments
Open
1 of 2 tasks

Dead GUC enable_geqo in CBDB, but not in Postgres #657

avamingli opened this issue Oct 9, 2024 · 12 comments
Labels
type: Bug Something isn't working

Comments

@avamingli
Copy link
Contributor

avamingli commented Oct 9, 2024

Cloudberry Database version

main

What happened

Commit 9e6035c bring back enable_geqo, but the GUC is still not used in CBDB:
https://github.com/search?q=repo%3Acloudberrydb%2Fcloudberrydb+enable_geqo&type=code

In Postgres, it surely be used in planner:
https://github.com/search?q=repo%3Apostgres%2Fpostgres%20enable_geqo&type=code

Though it's put in codes, but still a dead GUC in codes and users can not set it when using CBDB.
Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

What you think should happen instead

No response

How to reproduce

NULL

Operating System

All

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@avamingli avamingli added the type: Bug Something isn't working label Oct 9, 2024
@roseduan
Copy link
Contributor

roseduan commented Oct 9, 2024

This commit 9e6035c describes very clear that it is only for extension compatibility sake.

@avamingli
Copy link
Contributor Author

avamingli commented Oct 9, 2024

This commit 9e6035c describes very clear that it is only for extension compatibility sake.

Yeah, my question is

Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

@gfphoenix78
Copy link
Contributor

gfphoenix78 commented Oct 21, 2024

This commit 9e6035c describes very clear that it is only for extension compatibility sake.

Yeah, my question is

Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

+1. If the kernel doesn't know the GUC, it could be defined in the extension.
I doubt if the GUC is used by two or more extensions.

@gfphoenix78
Copy link
Contributor

@reshke If the GUC is used only by the extension, why not define the GUC in the extension?

@reshke
Copy link
Contributor

reshke commented Oct 21, 2024

Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

I dont know the reasons and im purely unaware of them. But this looks like very-very old decision made and early as Greenplum 5 or even earlier. I did not find any discussion nor reasons in opensource.

@reshke If the GUC is used only by the extension, why not define the GUC in the extension?

Because this way you need to modify extension to that this extension is compatible with CBDB. So, for every extension that uses enable_geqo GUC, and that you want to use with CBDB, you need to do some modification. In my opinion is in purely beneficial to avoid that.

@gfphoenix78
Copy link
Contributor

gfphoenix78 commented Oct 21, 2024

Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

I dont know the reasons and im purely unaware of them. But this looks like very-very old decision made and early as Greenplum 5 or even earlier. I did not find any discussion nor reasons in opensource.

@reshke If the GUC is used only by the extension, why not define the GUC in the extension?

Because this way you need to modify extension to that this extension is compatible with CBDB. So, for every extension that uses enable_geqo GUC, and that you want to use with CBDB, you need to do some modification. In my opinion is in purely beneficial to avoid that.

The common code are dead code in the kernel. Normally, the common code for extensions are general and interactive with kernel code, like frame work, hook functions etc., not just defined but unused. The one principal of the code conduct is to make kernel code clean. Let's make the adapter code in the extension, not in the kernel.

@reshke
Copy link
Contributor

reshke commented Oct 21, 2024

Why GPDB/CBDB doesn't use enable_geqo unlike Postgres, shall we use it in codes or remove it entirely if there is any consideration by GPDB?

I dont know the reasons and im purely unaware of them. But this looks like very-very old decision made and early as Greenplum 5 or even earlier. I did not find any discussion nor reasons in opensource.

@reshke If the GUC is used only by the extension, why not define the GUC in the extension?

Because this way you need to modify extension to that this extension is compatible with CBDB. So, for every extension that uses enable_geqo GUC, and that you want to use with CBDB, you need to do some modification. In my opinion is in purely beneficial to avoid that.

The common code are dead code in the kernel. Normally, the common code for extensions are general and interactive with kernel code, like frame work, hook functions etc., not just defined but unused. The one principal of the code conduct is to make kernel code clean. Let's make the adapter code in the extension, not in the kernel.

In cbdb enable_geqo value is always false, so absence of core geqo related parts is not an issue, when not used. enable_geqo just signals extension to avoid there usages. IMO it is very inconsistent decision (made by Pivotal or even earlier) to remove enable_geqo, leaving all other geqo-related gucs in place. Example: https://github.com/cloudberrydb/cloudberrydb/blob/main/src/backend/utils/misc/guc.c#L3780-L3789
Besides that, it is anyway more developer friendly to leave GUCS in kernel even with no use. Example: https://github.com/cloudberrydb/cloudberrydb/blob/main/src/backend/utils/misc/guc.c#L1219-L1228

@avamingli
Copy link
Contributor Author

avamingli commented Oct 21, 2024

The geo related GUCs are ignored since Greenplum init commit, we don't know the concern behind that , and if it's still a concern now.
I'm thinking of getting GUCs like enable_geqo back in planner codes.Because the search space could be very large if the we have many join tables, typically more than 10.
And dynamic programming algorithm will be much slower than geo, that's why Postgres use it.

@reshke
Copy link
Contributor

reshke commented Oct 21, 2024

I'm thinking of getting GUCs like enable_geqo back in planner codes.

totally agree

@avamingli
Copy link
Contributor Author

I'm thinking of getting GUCs like enable_geqo back in planner codes.

totally agree

The challenges are:
It will introduce plan diffs for ICW.
And.. Genetic Algorithm is complex, I do have a little knowledge about that in my postgraduate dissertation, but I don't have the confidence to fix bugs quickly in Database planner codes.

@reshke
Copy link
Contributor

reshke commented Oct 21, 2024

I do have a little knowledge about that in my postgraduate dissertation, but I don't have the confidence to fix bugs quickly in Database planner codes.

Thats ok, thats why enable_geqo GUC exists. If we default this GUC to false, any planner issue will be easily work-arounded. And also, if we catch some GEQO-related bugs, we can interact on issue using pgsql-hackers, huh? In cases, when bug is reproducible in pure postgres

@jiaqizho
Copy link
Contributor

jiaqizho commented Oct 23, 2024

I have a question, so why we not call the DefineCustomBoolVariable (i mean define the enable_geqo in the guc.c) in CBDB, but only export it? :)
@reshke

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants