Add a stand-alone load mode to DM. (WIP) #11738

Closed
wants to merge 4 commits into from

Conversation

OliverS929

What problem does this PR solve?

Issue Number: close #9230

What is changed and how it works?

Initial work to add a stand-alone load mode to DM, alongside the existing dump mode. The load&sync mode might no longer be needed and could be deprecated in the future.

UT/IT added are under testing and will be pushed soon.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?

N/A

Do you need to update user documentation, design documentation or monitoring documentation?

Yes. Descriptions of the standalone modes would need to be added to the docs.

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Contributor

ti-chi-bot bot commented Nov 12, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. labels Nov 12, 2024
Contributor

ti-chi-bot bot commented Nov 12, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign benmeadowcroft, d3hunter for approval, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added area/dm Issues or PRs related to DM. area/engine Issues or PRs related to Dataflow Engine. labels Nov 12, 2024
@sre-bot

sre-bot commented Nov 12, 2024

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 12, 2024
@OliverS929
Author

/test all

@OliverS929
Author

/retest

@OliverS929
Author

/retest

Contributor

ti-chi-bot bot commented Nov 13, 2024

@OliverS929: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| jenkins-ticdc/verify | a323638 | link | true | `/test verify` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@OliverS929
Author

/test engine-integration-test

@OliverS929 OliverS929 changed the title Add a stand-alone load mode to DM. (WIP) Add a stand-alone load mode to DM. Nov 14, 2024
@OliverS929 OliverS929 changed the title Add a stand-alone load mode to DM. Add a stand-alone load mode to DM. (WIP) Nov 14, 2024
@OliverS929 OliverS929 changed the title Add a stand-alone load mode to DM. (WIP) Add a stand-alone load mode to DM. (WIP) Nov 14, 2024
Comment on lines 333 to +334
// adjust dir, no need to do for load&sync mode because it needs its own s3 repository
+// still use dir for standalone load mode (different from the behavior of load&sync mode)
Contributor

Suggested change
-// adjust dir, no need to do for load&sync mode because it needs its own s3 repository
-// still use dir for standalone load mode (different from the behavior of load&sync mode)
+// adjust dir for modes with load unit except load&sync mode because it needs its own s3 repository

BTW, because I don't know how the new "load" mode is used, I'm not sure whether DM should adjust the dir or not. Maybe the new mode should behave like "load&sync"?

Author

@OliverS929 OliverS929 Nov 14, 2024

I think a comment in "engine/executor/dm/worker.go" mentions that load&sync mode has its own external storage. From my understanding, load mode works with LoaderConfig.Dir, so calling storage.AdjustPath is necessary in this case. Also /cc @D3Hunter. I would appreciate it if you could take a look.

Author

Changed this further so that load and load&sync both skip the adjust part of the code and use "dir" directly.
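
For readers following the thread, here is a minimal sketch of the behavior described in the comment above: load and load&sync use the configured dir as-is, while other modes get an adjusted path. Everything in it is an assumption made for illustration. `TaskMode`, `withSuffix`, and `loaderDir` are hypothetical names, and `withSuffix` is only a placeholder for the real path adjustment (the thread mentions `storage.AdjustPath`, whose exact behavior is not reproduced here).

```go
package main

import "fmt"

// TaskMode is a hypothetical stand-in for the task-mode values in dmconfig.
type TaskMode string

const (
	ModeFull     TaskMode = "full"
	ModeAll      TaskMode = "all"
	ModeLoad     TaskMode = "load"
	ModeLoadSync TaskMode = "load&sync"
)

// withSuffix is a hypothetical placeholder for the real path adjustment.
// It only appends a unique ID so that tasks do not share a directory.
func withSuffix(dir, uniqueID string) string {
	return dir + "." + uniqueID
}

// loaderDir returns the directory the load unit would read from.
func loaderDir(mode TaskMode, dir, uniqueID string) string {
	switch mode {
	case ModeLoad, ModeLoadSync:
		// Assumption: these modes read data that already sits at dir,
		// so the path is used unchanged.
		return dir
	default:
		// Modes that dump first get a per-task directory.
		return withSuffix(dir, uniqueID)
	}
}

func main() {
	for _, m := range []TaskMode{ModeFull, ModeAll, ModeLoad, ModeLoadSync} {
		fmt.Printf("%-9s -> %s\n", m, loaderDir(m, "/data/dump", "task-1"))
	}
}
```

Keeping the decision in one switch makes it easy to see which modes share the "use dir directly" behavior if another mode is added later.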

@@ -935,7 +935,7 @@ var (
ErrConfigSyncerCfgConflict = New(codeConfigSyncerCfgConflict, ClassConfig, ScopeInternal, LevelMedium, "syncer-config-name and syncer should only specify one", "Please check the `syncer-config-name` and `syncer` config in task configuration file.")
ErrConfigReadCfgFromFile = New(codeConfigReadCfgFromFile, ClassConfig, ScopeInternal, LevelMedium, "read config file %v", "")
ErrConfigNeedUniqueTaskName = New(codeConfigNeedUniqueTaskName, ClassConfig, ScopeInternal, LevelMedium, "must specify a unique task name", "Please check the `name` config in task configuration file.")
-ErrConfigInvalidTaskMode = New(codeConfigInvalidTaskMode, ClassConfig, ScopeInternal, LevelMedium, "please specify right task-mode, support `full`, `incremental`, `all`", "Please check the `task-mode` config in task configuration file.")
+ErrConfigInvalidTaskMode = New(codeConfigInvalidTaskMode, ClassConfig, ScopeInternal, LevelMedium, "please specify right task-mode, support `full`, `incremental`, `all`, `dump`, `load`", "Please check the `task-mode` config in task configuration file.")
Contributor

If we don't want to expose the two modes in the dmctl use case, we shouldn't change this message.

Author

Got it. Will change it back.

Author

On second thought, terror is shared between OpenAPI and dmctl. I think we might want to change this first to make sure our behavior with OpenAPI is correct. Support for these two modes in dmctl could come a bit later. What do you think?

@@ -287,7 +287,8 @@ func (tm *TaskManager) allFinished(ctx context.Context) bool {
if runningTask.Unit != frameModel.WorkerDMLoad {
return false
}
-case dmconfig.ModeDump:
+case dmconfig.ModeDump, dmconfig.ModeLoad, dmconfig.ModeLoadSync:
Contributor

Suggested change
-case dmconfig.ModeDump, dmconfig.ModeLoad, dmconfig.ModeLoadSync:
+case dmconfig.ModeDump, dmconfig.ModeLoad:

sync unit will never finish

Author

Thank you for pointing this out. Will change this back.
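
To make the reviewer's point concrete, here is a small sketch of the finish check with the suggested case list. It is not the real `TaskManager.allFinished`. The types, constants, and the `jobFinished` helper are hypothetical stand-ins, and the only thing taken from the thread is the rule that a mode containing a sync unit must never count as finished.

```go
package main

import "fmt"

// TaskMode and Unit are hypothetical stand-ins for the dmconfig task modes
// and the frameModel worker types that appear in the diff.
type TaskMode string
type Unit string

const (
	ModeFull     TaskMode = "full"
	ModeDump     TaskMode = "dump"
	ModeLoad     TaskMode = "load"
	ModeLoadSync TaskMode = "load&sync"

	UnitDump Unit = "dump"
	UnitLoad Unit = "load"
	UnitSync Unit = "sync"
)

// jobFinished reports whether a task in the given mode can be treated as done.
func jobFinished(mode TaskMode, unit Unit, stageFinished bool) bool {
	switch mode {
	case ModeFull:
		// full = dump then load, so only a finished load unit ends the task.
		if unit != UnitLoad {
			return false
		}
	case ModeDump, ModeLoad:
		// Single-unit modes: the stage of that one unit decides.
	default:
		// Any mode with a sync unit (such as load&sync) runs indefinitely.
		return false
	}
	return stageFinished
}

func main() {
	fmt.Println(jobFinished(ModeLoad, UnitLoad, true))     // finished load-only task
	fmt.Println(jobFinished(ModeLoadSync, UnitSync, true)) // sync unit never finishes
	fmt.Println(jobFinished(ModeFull, UnitDump, true))     // full task still has to load
}
```

With `ModeLoadSync` in the same case as `ModeDump` and `ModeLoad`, the second call would depend on the stage flag alone, which is the behavior the review comment asks to avoid.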

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 14, 2024
@OliverS929 OliverS929 closed this Nov 14, 2024
Labels
area/dm Issues or PRs related to DM. area/engine Issues or PRs related to Dataflow Engine. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

engine: support new dm task mode
3 participants