[Feature] Upgrade Python code and dependencies (gpMgmt module) #656
Thank you for the comprehensive proposal to upgrade the Python code and dependencies in the gpMgmt module. I appreciate the detailed suggestions for improving the current codebase, including:
I also note the suggestion to transition from pygresql to psycopg3. It's worth mentioning that this transition is already being considered as part of the upcoming Greenplum merge work, which aligns well with the proposal's goals. These proposed changes would indeed address several important issues in the current codebase.
However, while I acknowledge the value of these improvements, I believe we should consider a more fundamental shift in our approach to cluster management. Instead of investing time in refactoring the current Python-based system, I propose we explore an alternative cluster management mechanism that addresses the core limitations of our current approach, particularly the reliance on passwordless SSH.
Proposed Alternative: Rust-based gRPC Framework
I recommend developing a new cluster management framework using Rust and gRPC. This approach offers several significant advantages:
Note: While Rust and gRPC are suggested here due to their strong alignment with our needs, they are not the only options. There are several modern technologies and approaches we could consider that would provide significant improvements over the current passwordless SSH mechanism. The key is to move towards a more secure, efficient, and maintainable solution for cluster management.
Key Components of the Proposed Framework
Benefits Over Python Refactoring
Compared to Rust + gRPC, I personally lean towards Go + gRPC for several reasons:
Despite ongoing discussions about the finer points of implementation and future plans, we can all agree that: (1) There is a clear need to enhance the Python code within gpMgmt; (2) CBDB's current reliance on passwordless SSH and remote script execution for cluster management is outdated and poses significant security risks; a more modern approach is necessary; (3) The importance of maintaining the usability and stability of gpMgmt tools cannot be overstated, as they are used daily by CBDB users.
I suggest we start by addressing the first issue: upgrading the Python version and dependencies, refactoring the code, and adding more tests to the gpMgmt module. @SaelKimberly would you like to submit some PRs?
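For anyone picking up that first step, here is a rough sketch of the style the proposal asks for: explicit imports instead of star imports, plus type annotations on classes and functions. Every name in it (SegmentHost, format_command, the default port) is hypothetical and chosen only for illustration; none of it is taken from the gpMgmt code.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass
from shlex import quote  # explicit import rather than `from shlex import *`


@dataclass(frozen=True)
class SegmentHost:
    """Hypothetical record for a host that a management tool addresses."""

    hostname: str
    port: int = 5432  # illustrative default, not a gpMgmt setting


def format_command(host: SegmentHost, args: Sequence[str]) -> str:
    """Build a shell-safe argument string for the given host.

    Each part is quoted individually so that arguments containing
    spaces survive being passed through a shell.
    """
    if not args:
        raise ValueError("args must not be empty")
    parts = (f"--host={host.hostname}", f"--port={host.port}", *args)
    return " ".join(quote(part) for part in parts)
```

Small, fully annotated functions like this are also easy to cover with unit tests, which fits the "adding more tests" part of the plan.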
Description
Hello everyone.
I am a Python developer, and I would like to discuss the situation with gpMgmt. There are several problems that I think should be solved:
cli module, and management utilities can be called as follows:
And all available commands will be listed with:
Ruff utility (the most advanced linter for Python), and clarify input data types and fields in all functions and classes, in accordance with best practices.
star imports (from module import *) and absolute imports, excluding Python system modules and dependency packages.
Use case/motivation
I see the need for such changes for several reasons:
The gpMgmt module is written using Python 2.x, which is absolutely unacceptable from the point of view of support and security. Newer versions of Python are fundamentally unable to run this code, and no vulnerabilities can be fixed. In fact, as of today, the minimum supported version of Python is 3.9, and Python 3.8 receives security updates only as source-only releases. This automatically introduces a restriction on some key dependencies: for example, PyGreSQL cannot be lower than version 6.x.
cloudberry, but as a Python developer, I understand that this system needs to be carefully refined before I can be sure that it will work correctly in production. I am especially concerned that code of this quality manages the entire system.
I can probably start developing an alternative control module. Basically, it will come down to a careful code migration, and therefore will not take much time. However, some parts will probably require large changes to the code base. Also, I would like to switch from pygresql to psycopg3, which is adapted for use on most systems.
Related issues
No response
Are you willing to submit a PR?