Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

package: Add --exclude option to exclude components from being started. #492

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 16 additions & 14 deletions components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Original file line number Diff line number Diff line change
Expand Up @@ -993,6 +993,8 @@ def main(argv):
help="CLP package configuration file.",
)

args_parser.add_argument("--exclude", "-no", nargs="+", default=[], help="Exclude component(s) from being started.")

component_args_parser = args_parser.add_subparsers(dest="target")
component_args_parser.add_parser(CONTROLLER_TARGET_NAME)
component_args_parser.add_parser(DB_COMPONENT_NAME)
Expand All @@ -1012,10 +1014,10 @@ def main(argv):

parsed_args = args_parser.parse_args(argv[1:])

target = ALL_TARGET_NAME
exclude = parsed_args.exclude
if parsed_args.target:
target = parsed_args.target
else:
target = ALL_TARGET_NAME

try:
check_dependencies()
Expand Down Expand Up @@ -1096,37 +1098,37 @@ def main(argv):
conf_dir = clp_home / "etc"

# Start components
if target in (ALL_TARGET_NAME, DB_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, DB_COMPONENT_NAME) and DB_COMPONENT_NAME not in exclude:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't like how the code is written now. it's quite hard to read and could be error prone. In addition, it looks like if user make a typo in the exclude list, CLP will sliently run the "exclude target"

In the closed source CLP, we have a config file to specify what components to run and one can remove component from the config file to remove components.
It might make more sense if we let the start-clp to launch everything by default (i.e. if a config file is not provided), or launch the specified component based on the config file.

If we plan to go with the current approach, I think if the if statement can be viewed as

if target in (ALL_TARGET, [TASK_TARGET]) and TASK_TARGET not in exclude:

maybe we can factor it out as a function to make the code less messy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't like how the code is written now. it's quite hard to read and could be error prone. In addition, it looks like if user make a typo in the exclude list, CLP will sliently run the "exclude target"

In the closed source CLP, we have a config file to specify what components to run and one can remove component from the config file to remove components. It might make more sense if we let the start-clp to launch everything by default (i.e. if a config file is not provided), or launch the specified component based on the config file.

If we plan to go with the current approach, I think if the if statement can be viewed as

if target in (ALL_TARGET, [TASK_TARGET]) and TASK_TARGET not in exclude:

maybe we can factor it out as a function to make the code less messy.

With help from @Henry8192 , we have refactored the if-checks into calls of a checker helper. Also now we also validate whether the names in the exclude list are valid before proceeding. Let me know if this is more readable and maintainable now.

start_db(instance_id, clp_config, conf_dir)
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, DB_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, DB_COMPONENT_NAME) and CONTROLLER_TARGET_NAME not in exclude and DB_COMPONENT_NAME not in exclude:
create_db_tables(instance_id, clp_config, container_clp_config, mounts)
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, QUEUE_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, QUEUE_COMPONENT_NAME) and CONTROLLER_TARGET_NAME not in exclude and QUEUE_COMPONENT_NAME not in exclude:
start_queue(instance_id, clp_config)
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, REDIS_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, REDIS_COMPONENT_NAME) and CONTROLLER_TARGET_NAME not in exclude and REDIS_COMPONENT_NAME not in exclude:
start_redis(instance_id, clp_config, conf_dir)
if target in (ALL_TARGET_NAME, RESULTS_CACHE_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, RESULTS_CACHE_COMPONENT_NAME) and RESULTS_CACHE_COMPONENT_NAME not in exclude:
start_results_cache(instance_id, clp_config, conf_dir)
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, RESULTS_CACHE_COMPONENT_NAME):
create_results_cache_indices(instance_id, clp_config, container_clp_config, mounts)
if target in (
ALL_TARGET_NAME,
CONTROLLER_TARGET_NAME,
COMPRESSION_SCHEDULER_COMPONENT_NAME,
):
) and CONTROLLER_TARGET_NAME not in exclude and COMPRESSION_SCHEDULER_COMPONENT_NAME not in exclude:
start_compression_scheduler(instance_id, clp_config, container_clp_config, mounts)
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, QUERY_SCHEDULER_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, QUERY_SCHEDULER_COMPONENT_NAME) and CONTROLLER_TARGET_NAME not in exclude and QUERY_SCHEDULER_COMPONENT_NAME not in exclude:
start_query_scheduler(instance_id, clp_config, container_clp_config, mounts)
if target in (ALL_TARGET_NAME, COMPRESSION_WORKER_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, COMPRESSION_WORKER_COMPONENT_NAME) and COMPRESSION_WORKER_COMPONENT_NAME not in exclude:
start_compression_worker(
instance_id, clp_config, container_clp_config, num_workers, mounts
)
if target in (ALL_TARGET_NAME, QUERY_WORKER_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, QUERY_WORKER_COMPONENT_NAME) and QUERY_WORKER_COMPONENT_NAME not in exclude:
start_query_worker(instance_id, clp_config, container_clp_config, num_workers, mounts)
if target in (ALL_TARGET_NAME, REDUCER_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, REDUCER_COMPONENT_NAME) and REDUCER_COMPONENT_NAME not in exclude:
start_reducer(instance_id, clp_config, container_clp_config, num_workers, mounts)
if target in (ALL_TARGET_NAME, WEBUI_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, WEBUI_COMPONENT_NAME) and WEBUI_COMPONENT_NAME not in exclude:
start_webui(instance_id, clp_config, mounts)
if target in (ALL_TARGET_NAME, LOG_VIEWER_WEBUI_COMPONENT_NAME):
if target in (ALL_TARGET_NAME, LOG_VIEWER_WEBUI_COMPONENT_NAME) and LOG_VIEWER_WEBUI_COMPONENT_NAME not in exclude:
start_log_viewer_webui(instance_id, clp_config, container_clp_config, mounts)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haiqi96
These look simple but are definitely less clean and scalable. I am thinking about refactoring this into a dict which stores {component_name: (method, (args1, args2))} pairs and a for-loop to go over the dict. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's ok to have the line that launches the component. the part that introduce complexity is the if statement, especially we have some component -> COMPONENT_NAME / GROUP_MAPPING but it's not very clear what are all the mappings. So I would prefer to approach this from simplifying the if statements.

Copy link
Collaborator Author

@junhaoliao junhaoliao Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I agree. Abstracting the if-condition checks into a single function call per if should keep the code simple and clean.


except Exception as ex:
Expand Down
Loading