Add a stand-alone load mode to DM. (WIP) #11738

OliverS929 · 2024-11-12T09:43:53Z

What problem does this PR solve?

Issue Number: close #9230

What is changed and how it works?

Initial work to add a stand-alone load mode to DM, alongside the existing dump mode. Load&Sync mode might no longer be needed and could be deprecated in the future.

The unit tests and integration tests for this change are still being verified and will be pushed soon.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Questions

Will it cause performance regression or break compatibility?

N/A

Do you need to update user documentation, design documentation or monitoring documentation?

Yes. Descriptions of the standalone modes need to be added to the docs.

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.


ti-chi-bot · 2024-11-12T09:43:56Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

ti-chi-bot · 2024-11-12T09:43:57Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign benmeadowcroft, d3hunter for approval, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sre-bot · 2024-11-12T09:44:00Z

All committers have signed the CLA.

OliverS929 · 2024-11-12T09:45:24Z

/test all

OliverS929 · 2024-11-13T11:16:53Z

/retest

OliverS929 · 2024-11-13T11:55:11Z

/retest

ti-chi-bot · 2024-11-13T12:13:36Z

@OliverS929: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| jenkins-ticdc/verify | `a323638` | link | true | `/test verify` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

OliverS929 · 2024-11-13T12:41:56Z

/test engine-integration-test

lance6716 · 2024-11-14T03:35:56Z

dm/config/subtask.go

 	// adjust dir, no need to do for load&sync mode because it needs its own s3 repository
+	// still use dir for standalone load mode (different from the behavior of load&sync mode)


Suggested change

-	// adjust dir, no need to do for load&sync mode because it needs its own s3 repository
-	// still use dir for standalone load mode (different from the behavior of load&sync mode)
+	// adjust dir for modes with load unit except load&sync mode because it needs its own s3 repository

BTW, because I don't know how the new "load" mode is used, I'm not sure whether DM should adjust the dir. Maybe the new mode should behave like "load&sync"?

I think a comment in "engine/executor/dm/worker.go" mentions that load&sync mode has its own external storage. From my understanding, load mode works with LoaderConfig.Dir, so calling storage.AdjustPath is necessary in this case. Also /cc @D3Hunter. I would appreciate it if you could take a look.

Change this further so that load and load&sync both skip the adjust part of the code and use "dir" directly.
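For readers following this thread, the outcome can be sketched roughly as below. This is an illustrative sketch only, not the code in dm/config/subtask.go: the mode strings, `needsDirAdjust`, `adjustDir`, and `resolveDir` are hypothetical names, and `adjustDir` is a stand-in for the real path adjustment with a made-up suffix rule.

```go
package main

import "fmt"

// Task-mode names used for illustration; "load" is the mode proposed in this PR.
// The string values are assumptions, not copied from the DM source.
const (
	modeFull     = "full"
	modeAll      = "all"
	modeLoad     = "load"
	modeLoadSync = "load&sync"
)

// needsDirAdjust is a hypothetical helper: modes that dump and then load
// get a task-specific dir, while load and load&sync use "dir" as given.
func needsDirAdjust(mode string) bool {
	switch mode {
	case modeFull, modeAll:
		return true
	default:
		return false
	}
}

// adjustDir is a placeholder for the real path adjustment; the suffix
// format here is invented for the example.
func adjustDir(dir, uniqueID string) string {
	return dir + "." + uniqueID
}

// resolveDir returns the directory a task in the given mode would read from.
func resolveDir(mode, dir, uniqueID string) string {
	if needsDirAdjust(mode) {
		return adjustDir(dir, uniqueID)
	}
	return dir
}

func main() {
	for _, m := range []string{modeFull, modeAll, modeLoad, modeLoadSync} {
		fmt.Printf("%s -> %s\n", m, resolveDir(m, "./dumped_data", "task-1"))
	}
}
```

The point of keeping the decision in one helper is that a later change of mind (for example, adjusting the dir for load mode after all) touches a single switch.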

lance6716 · 2024-11-14T03:36:47Z

dm/pkg/terror/error_list.go

@@ -935,7 +935,7 @@ var (
 	ErrConfigSyncerCfgConflict      = New(codeConfigSyncerCfgConflict, ClassConfig, ScopeInternal, LevelMedium, "syncer-config-name and syncer should only specify one", "Please check the `syncer-config-name` and `syncer` config in task configuration file.")
 	ErrConfigReadCfgFromFile        = New(codeConfigReadCfgFromFile, ClassConfig, ScopeInternal, LevelMedium, "read config file %v", "")
 	ErrConfigNeedUniqueTaskName     = New(codeConfigNeedUniqueTaskName, ClassConfig, ScopeInternal, LevelMedium, "must specify a unique task name", "Please check the `name` config in task configuration file.")
-	ErrConfigInvalidTaskMode        = New(codeConfigInvalidTaskMode, ClassConfig, ScopeInternal, LevelMedium, "please specify right task-mode, support `full`, `incremental`, `all`", "Please check the `task-mode` config in task configuration file.")
+	ErrConfigInvalidTaskMode        = New(codeConfigInvalidTaskMode, ClassConfig, ScopeInternal, LevelMedium, "please specify right task-mode, support `full`, `incremental`, `all`, `dump`, `load`", "Please check the `task-mode` config in task configuration file.")


If we don't want to expose the two modes in the dmctl use case, we shouldn't change this message.

Got it. Will change it back.

On second thought, terror is shared between OpenAPI and dmctl. I think we might want to change this first to make sure our behavior with OpenAPI is correct. Support for these two modes in dmctl could come a bit later. What do you think?
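One way to see why the message matters for both entry points: if the validator and the error text come from the same list, any caller (OpenAPI or dmctl) that hits the error sees every accepted mode. The sketch below is an assumption-laden illustration; `supportedModes`, `quoteModes`, and `validateTaskMode` are hypothetical names and do not reflect how DM's terror package is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedModes mirrors the modes listed in the proposed error message.
// Treat the list as illustrative rather than authoritative.
var supportedModes = []string{"full", "incremental", "all", "dump", "load"}

// quoteModes wraps each mode in backticks and joins them with commas.
func quoteModes(modes []string) string {
	quoted := make([]string, len(modes))
	for i, m := range modes {
		quoted[i] = "`" + m + "`"
	}
	return strings.Join(quoted, ", ")
}

// validateTaskMode is a hypothetical validator whose error text is
// derived from the same list it checks against.
func validateTaskMode(mode string) error {
	for _, m := range supportedModes {
		if m == mode {
			return nil
		}
	}
	return fmt.Errorf("please specify right task-mode, support %s", quoteModes(supportedModes))
}

func main() {
	fmt.Println(validateTaskMode("load"))
	fmt.Println(validateTaskMode("bogus"))
}
```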

lance6716 · 2024-11-14T03:38:05Z

engine/jobmaster/dm/task_manager.go

@@ -287,7 +287,8 @@ func (tm *TaskManager) allFinished(ctx context.Context) bool {
 			if runningTask.Unit != frameModel.WorkerDMLoad {
 				return false
 			}
-		case dmconfig.ModeDump:
+		case dmconfig.ModeDump, dmconfig.ModeLoad, dmconfig.ModeLoadSync:


Suggested change

-		case dmconfig.ModeDump, dmconfig.ModeLoad, dmconfig.ModeLoadSync:
+		case dmconfig.ModeDump, dmconfig.ModeLoad:

The sync unit never finishes, so a load&sync task must not be treated as finished here.

Thank you for pointing this out. Will change this back.
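The reasoning in this thread can be sketched as a small stand-alone function. This is a simplified illustration and not the real `TaskManager.allFinished`: the `task` struct, its fields, and the plain string mode and unit names are assumptions made for the example.

```go
package main

import "fmt"

// task is a simplified, hypothetical view of a running DM task.
type task struct {
	mode     string // task mode, e.g. "full", "dump", "load", "load&sync"
	unit     string // unit currently reported, e.g. "dump", "load", "sync"
	finished bool   // whether the reported unit has reached a finished stage
}

// allFinished reports whether every task is done for good.
func allFinished(tasks []task) bool {
	for _, t := range tasks {
		if !t.finished {
			return false
		}
		switch t.mode {
		case "full":
			// full mode runs dump then load; only a finished load unit counts.
			if t.unit != "load" {
				return false
			}
		case "dump", "load":
			// single-unit modes: a finished unit means a finished task.
		default:
			// any mode that ends in a sync unit (including load&sync)
			// keeps replicating and is never considered finished.
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allFinished([]task{{mode: "load", unit: "load", finished: true}}))
	fmt.Println(allFinished([]task{{mode: "load&sync", unit: "sync", finished: true}}))
}
```

Routing unknown modes to the `default` branch keeps the check conservative: a newly added mode is treated as unfinished until someone lists it explicitly.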

Primary work to add a stand-alone load mode to DM. Load&Sync mode might not be needed anymore and could be depreciated in the future.

73620d2

ti-chi-bot bot added area/dm Issues or PRs related to DM. area/engine Issues or PRs related to Dataflow Engine. labels Nov 12, 2024

ti-chi-bot bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 12, 2024

Fix format.

a323638

Update error.txt

83ddceb

OliverS929 changed the title ~~Add a stand-alone load mode to DM. (WIP)~~ Add a stand-alone load mode to DM. Nov 14, 2024

OliverS929 changed the title ~~Add a stand-alone load mode to DM.~~ Add a stand-alone load mode to DM. （WIP） Nov 14, 2024

OliverS929 changed the title ~~Add a stand-alone load mode to DM. （WIP）~~ Add a stand-alone load mode to DM. (WIP) Nov 14, 2024

lance6716 reviewed Nov 14, 2024


Add load mode to openapi.

e0d9a1c

ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 14, 2024

OliverS929 mentioned this pull request Nov 14, 2024

dm: add a stand-alone load mode #11749

Open

2 tasks

OliverS929 closed this Nov 14, 2024
