Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributed mode phase 1 #98

Merged
merged 47 commits into from
Jun 24, 2024
Merged

Distributed mode phase 1 #98

merged 47 commits into from
Jun 24, 2024

Conversation

iandyh
Copy link
Contributor

@iandyh iandyh commented Dec 8, 2023

This implements the issue described here: #19

In particular, this PR focuses on two main points mentioned here: #19 (comment)

We essentially split the monolith into two: api and controller.

  1. Put some isolated background task into a separate process. Please bear in mind that, Shibuya will still support non-distributed mode as in the past.
  2. Due to we will have more than 1 deployment to be managed, we will switch to Helm to manage the manifests. There are also some necessary changes:
    • Dockerfile is updated.
    • We will use GitHub Action to build the image for api and controller. GitHub action will also build the charts and release into GitHub. So later users can helm pull xxx to get the install charts.

The GitHub Action build can be seen here: https://github.com/iandyh/shibuya/actions/runs/7137516355/job/1943762659

@rsw-a
Copy link
Collaborator

rsw-a commented Dec 20, 2023

@iandyh

The GitHub Action build can be seen here: https://github.com/iandyh/shibuya/actions/runs/7137516355/job/1943762659

this returns http status 404 to me.

@iandyh
Copy link
Contributor Author

iandyh commented Dec 20, 2023

@rsw-a
Copy link
Collaborator

rsw-a commented Dec 20, 2023

@iandyh

@rsw-a hmm, could you check this one? https://github.com/iandyh/shibuya/actions/runs/7137516355/job/19437626599

yep, it works!

Copy link
Collaborator

@rsw-a rsw-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iandyh
I've reviewed all changes. please check my comments.

one another question based on RFP, you've illustrated that shibuya release will be done by jenkins-aas pipeline based on helm introduced based on these changes. however, I was not able to see any details from ghe repos (shibuya-ci). is there any new changes for that? or you plan to do that at next phase?

.github/workflows/build-publish.yml Outdated Show resolved Hide resolved
.github/workflows/build-publish.yml Show resolved Hide resolved
.github/workflows/build-publish.yml Show resolved Hide resolved
.github/workflows/build-publish.yml Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought there would be no data integrity issues caused by race condition based on this separation, tho. coz all processing is about to wipe out unnecessary resources. do you have any concerns in terms of multi-threaded processing on the controller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, that depends on how to define mutli-threaded support.
If we look at the whole system(controller, api, etc), I don't have concern as this will be the same as before just process/threads are running either as a whole(multi-thread) or separately(multi process). The underlying logic does not change.
If we look at just the controller, it should not have multiple threads/process because essentially it's just a checker with a forever loop and getting the states from either the DB or a k8s cluster. If we have another checker(by using multi-threading), we need to implementing something like worker queue to split the work, which right now it does not exist. So having another checker will essentially doing the same work and it could be problematic so we should avoid that.
In terms of the scalability of the controller, I don't have great concern for it for now because I expect it to be a light weight component(most of its time is likely spending on IO operations anyway)
Hopefully I answered your questions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood. and i'm 80% sure what you've described above. but not 100%. so, it's better to visualize all operations which controller and api have provided (especially goroutines). otherwise it's a bit difficult to see entire picture. I'm about to do that on the confluence page, tho. it's a bit time-consuming and will take some time until illustrating everything. I'll make a review request to you piece by piece.

@iandyh iandyh force-pushed the split branch 3 times, most recently from 0938de9 to 1d17ce2 Compare February 19, 2024 02:49
@iandyh
Copy link
Contributor Author

iandyh commented Feb 20, 2024

@rsw-a I've either resolved the comment or replied to your comment. Please check. Thx.

Copy link
Collaborator

@rsw-a rsw-a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me.

NOTE: however, shibuya doesn't work at local env (macOS arm64 m1pro chip). because some components don't take arm64 architecture into account. (e.g., mariadb container image, build.sh)
it works over ubuntu VM after fixing those issues. anyway, we can address those issues later.

@iandyh
Copy link
Contributor Author

iandyh commented Apr 22, 2024

@rsw-a yes. I think I am a fix in my local branch and I’ll send it out later.

@iandyh iandyh changed the title Distributed mode [WIP] Distributed mode phase 1 May 20, 2024
@iandyh iandyh merged commit a3e0cd3 into rakutentech:master Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants