Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BugFix] Fix a bug that when catalog is unified catalog the background refresh for hive connector do not work #55215

Merged
merged 2 commits into from
Jan 21, 2025

Conversation

duanyyyyyyy
Copy link
Contributor

@duanyyyyyyy duanyyyyyyy commented Jan 17, 2025

Fix a bug that when catalog is unified catalog the background refresh for hive connector do not work

Why I'm doing:

We found that when we use unified catalog the background refresh will not work.
Because that when we register into the map, we will use the same key.

What I'm doing:

Introduce a new object named CatalogNameType and use this object as the key.
Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

@duanyyyyyyy duanyyyyyyy requested a review from a team as a code owner January 17, 2025 11:31
@duanyyyyyyy duanyyyyyyy force-pushed the background_refresh_bug_fix branch from 3c683fa to 40ae02d Compare January 17, 2025 11:32
@duanyyyyyyy duanyyyyyyy force-pushed the background_refresh_bug_fix branch 2 times, most recently from b5f5f94 to 55b0ee0 Compare January 17, 2025 16:03
@duanyyyyyyy
Copy link
Contributor Author

Hi @stephen-shelby
Any comments for this one?

@Youngwb
Copy link
Contributor

Youngwb commented Jan 21, 2025

@duanyyyyyyy Do you mean there is only one type catalog like hive catalog could refresh when you create a unified catalog? but hudi/delta lake catalog could not refresh?

@duanyyyyyyy
Copy link
Contributor Author

@duanyyyyyyy Do you mean there is only one type catalog like hive catalog could refresh when you create a unified catalog? but hudi/delta lake catalog could not refresh?

Yes, for example when I create the unified catalog, only delta lake can be refresh background.
But Hive can not, because we firstly put hive connector processor into the map.
And after that we will put the delta lake into it, but the key is the same....

@duanyyyyyyy
Copy link
Contributor Author

@duanyyyyyyy Do you mean there is only one type catalog like hive catalog could refresh when you create a unified catalog? but hudi/delta lake catalog could not refresh?

image
The screenshot can shown the sequence...


public class ConnectorProcessorName {

private final String catalogName;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think catalogType is better

private final HiveConnectorInternalMgr internalMgr;
private final HiveMetadataFactory metadataFactory;

public HiveConnector(ConnectorContext context) {
this.properties = context.getProperties();
this.catalogName = context.getCatalogName();
this.processorName = new ConnectorProcessorName(catalogName, "hive_connector");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.processorName = new ConnectorProcessorName(catalogName, "hive_connector");
this.processorName = new ConnectorProcessorName(catalogName, "hive");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maintain consistency with Delta Lake.

@@ -41,6 +43,7 @@ public HudiConnector(ConnectorContext context) {
CloudConfiguration cloudConfiguration = CloudConfigurationFactory.buildCloudConfigurationForStorage(properties);
HdfsEnvironment hdfsEnvironment = new HdfsEnvironment(cloudConfiguration);
this.catalogName = context.getCatalogName();
this.processorName = new ConnectorProcessorName(catalogName, "hudi_connector");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.processorName = new ConnectorProcessorName(catalogName, "hudi_connector");
this.processorName = new ConnectorProcessorName(catalogName, "hudi");

cacheUpdateProcessors.put(catalogName, cache);
public void registerCacheUpdateProcessor(ConnectorProcessorName processorName, CacheUpdateProcessor cache) {
LOG.info("register to update {} metadata cache from {} in the ConnectorTableMetadataProcessor",
processorName.getConnectorName(), processorName.getConnectorName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why log ConnectorName twice, plz print more explicit information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could print like this

register to update {}:{} metadata cache in the ConnectorTableMetadataProcessor", catalogName, catalogType

@duanyyyyyyy duanyyyyyyy force-pushed the background_refresh_bug_fix branch from 44f5cbd to af843a9 Compare January 21, 2025 03:44
…for hive connector do not work

Signed-off-by: duanyyyyyyy <[email protected]>
Signed-off-by: duanyyyyyyy <[email protected]>
@duanyyyyyyy duanyyyyyyy force-pushed the background_refresh_bug_fix branch from af843a9 to accdf9e Compare January 21, 2025 03:54
Copy link

Copy link

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

[FE Incremental Coverage Report]

pass : 23 / 26 (88.46%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/connector/hive/ConnectorTableMetadataProcessor.java 9 12 75.00% [107, 108, 109]
🔵 com/starrocks/connector/hive/CatalogNameType.java 6 6 100.00% []
🔵 com/starrocks/connector/delta/DeltaLakeConnector.java 3 3 100.00% []
🔵 com/starrocks/connector/hive/HiveConnector.java 3 3 100.00% []
🔵 com/starrocks/connector/hudi/HudiConnector.java 2 2 100.00% []

Copy link

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@duanyyyyyyy
Copy link
Contributor Author

Hi @Youngwb
Any other comments?

@Youngwb Youngwb enabled auto-merge (squash) January 21, 2025 05:55
@Youngwb Youngwb merged commit 8216577 into StarRocks:main Jan 21, 2025
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants