HIVE-28197: Add deserializer to convert JSON plans to RelNodes #6067

soumyakanti3578 · 2025-09-08T17:54:41Z

What changes were proposed in this pull request?

Added a deserializer to convert JSON plans to logical plans (RelNodes)

Why are the changes needed?

While we can serialize a plan to JSON with explain cbo formatted, there was no deserializer to convert that JSON back to a RelNode.

Does this PR introduce any user-facing change?

No

How was this patch tested?

mvn test -pl ql -Dtest=org.apache.hadoop.hive.ql.optimizer.calcite.TestRelPlanParser

sonarqubecloud · 2025-09-23T22:06:28Z

Quality Gate passed

Issues
64 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

zabetak

I haven't fully gone through the changes yet, but I'm sending a first batch of comments so that I don't lose them. Let me finalize the review before you start making code changes. For comments that simply require an answer, feel free to share your thoughts.

zabetak · 2025-09-25T06:53:20Z

ql/src/java/org/apache/hadoop/hive/ql/metadata/NotNullConstraint.java

        String enable = pk.isEnable_cstr()? "ENABLE": "DISABLE";
        String validate = pk.isValidate_cstr()? "VALIDATE": "NOVALIDATE";
        String rely = pk.isRely_cstr()? "RELY": "NORELY";
-        enableValidateRely.put(pk.getNn_name(), ImmutableList.of(enable, validate, rely));


Why is this change necessary?

zabetak · 2025-09-25T06:55:08Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveAggregate.java

+    this(input.getCluster(), input.getTraitSet(), input.getInput(),
+        input.getBitSet("group"), input.getBitSetList("groups"), input.getAggregateCalls("aggs"));


Can the following work?

Suggested change

-    this(input.getCluster(), input.getTraitSet(), input.getInput(),
-        input.getBitSet("group"), input.getBitSetList("groups"), input.getAggregateCalls("aggs"));
+    super(input);

If yes then can I do the same on the other RelNodes?
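To make the question concrete, here is a minimal sketch of the delegation pattern being asked about. It assumes the base class exposes a constructor that takes the deserialization input and decodes its own fields. `PlanInput`, `BaseAggregate`, `EngineAggregate`, and `AggregateDemo` are hypothetical stand-ins, not Hive or Calcite classes.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a deserialization input such as RelInput.
final class PlanInput {
  private final Map<String, Object> attributes;

  PlanInput(Map<String, Object> attributes) {
    this.attributes = attributes;
  }

  Object get(String key) {
    return attributes.get(key);
  }
}

// Hypothetical base operator that knows how to decode its own fields.
abstract class BaseAggregate {
  final List<Integer> groupKeys;
  final List<String> aggCalls;

  protected BaseAggregate(List<Integer> groupKeys, List<String> aggCalls) {
    this.groupKeys = List.copyOf(groupKeys);
    this.aggCalls = List.copyOf(aggCalls);
  }

  // Decoding lives here once, so subclasses do not repeat the attribute names.
  @SuppressWarnings("unchecked")
  protected BaseAggregate(PlanInput input) {
    this((List<Integer>) input.get("group"), (List<String>) input.get("aggs"));
  }
}

// Hypothetical engine-specific operator: the constructor only delegates.
final class EngineAggregate extends BaseAggregate {
  EngineAggregate(PlanInput input) {
    super(input);
  }
}

final class AggregateDemo {
  public static void main(String[] args) {
    PlanInput input = new PlanInput(Map.<String, Object>of(
        "group", List.of(0, 2),
        "aggs", List.of("count")));
    EngineAggregate agg = new EngineAggregate(input);
    System.out.println(agg.groupKeys + " " + agg.aggCalls);
  }
}
```

One thing to check before applying this broadly: delegating to `super(...)` skips whatever extra work the subclass's primary constructor does, so it is only equivalent when that constructor adds nothing beyond forwarding its arguments.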

zabetak · 2025-09-25T06:58:03Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveMultiJoin.java

+  public HiveMultiJoin(RelInput input) {
+    this(
+        input.getCluster(),
+        input.getInputs(),
+        input.getExpression("condition"),
+        input.getRowType("rowType"),
+        (List<Pair<Integer, Integer>>) input.get("getJoinInputsForHiveMultiJoin"),
+        (List<JoinRelType>) input.get("getJoinTypesForHiveMultiJoin"),
+        input.getExpressionList("filters")
+    );
+  }
+


Why do we need to modify this class? Normally we shouldn't need to serialize/deserialize MultiJoin expressions because they never appear in the final plan.

zabetak · 2025-09-25T07:00:22Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveRelNode.java

+  static Stream<RelNode> stream(RelNode node) {
+    return Stream.concat(
+        Stream.of(node),
+        node.getInputs()
+            .stream()
+            .flatMap(HiveRelNode::stream)
+    );
+  }


If we keep this we should add appropriate Javadoc. In addition, putting static methods in interfaces is not a good pattern; it is better to move it to a utility class.

Other than that, the most common way to traverse a RelNode tree is via visitors and shuttles, so I'm not sure this kind of Stream-based traversal will be well adopted.
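For illustration, the utility-class variant with Javadoc could look roughly like the sketch below. It uses a hypothetical `PlanNode` interface in place of RelNode so that it compiles on its own. The class names (`PlanNodes`, `SimpleNode`, `TraversalDemo`) are assumptions for the example, not proposals for the actual names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical minimal node type standing in for RelNode.
interface PlanNode {
  String name();

  List<PlanNode> inputs();
}

// Hypothetical immutable node implementation used only by this example.
final class SimpleNode implements PlanNode {
  private final String name;
  private final List<PlanNode> inputs;

  SimpleNode(String name, List<PlanNode> inputs) {
    this.name = name;
    this.inputs = List.copyOf(inputs);
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public List<PlanNode> inputs() {
    return inputs;
  }
}

/** Traversal helpers for plan trees; not instantiable. */
final class PlanNodes {
  private PlanNodes() {
  }

  /**
   * Returns a lazy pre-order stream over the given node and all of its
   * descendants: the node itself first, then each input subtree from left
   * to right.
   */
  static Stream<PlanNode> preOrder(PlanNode node) {
    return Stream.concat(
        Stream.of(node),
        node.inputs().stream().flatMap(PlanNodes::preOrder));
  }
}

final class TraversalDemo {
  public static void main(String[] args) {
    PlanNode plan = new SimpleNode("Join", List.of(
        new SimpleNode("ScanA", List.of()),
        new SimpleNode("Filter", List.of(new SimpleNode("ScanB", List.of())))));
    List<String> names = PlanNodes.preOrder(plan)
        .map(PlanNode::name)
        .collect(Collectors.toList());
    System.out.println(names);
  }
}
```

The stream form is convenient for read-only queries such as "does any node match this predicate". It cannot rewrite the tree, which is where visitors and shuttles remain the better fit.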

zabetak · 2025-09-25T07:12:38Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/FilterSelectivityEstimator.java

   * @param t
   * @return
   */
  private long getMaxNulls(RexCall call, HiveTableScan t) {


Why are we changing the selectivity estimator?

zabetak · 2025-09-25T10:27:09Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java

+    RelPlanParser parser = new RelPlanParser(cluster, conf);
+    RelNode deserializedPlan = parser.parse(jsonPlan);
+    // Apply partition pruning to compute partition list in HiveTableScan
+    deserializedPlan = applyPartitionPruning(conf, deserializedPlan, cluster, planner);


Why do we need the partition list? Can't we deserialize the plan without it?

zabetak · 2025-09-25T10:31:10Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java

+    // Apply partition pruning to compute partition list in HiveTableScan
+    deserializedPlan = applyPartitionPruning(conf, deserializedPlan, cluster, planner);
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Deserialized plan: \n{}", RelOptUtil.toString(deserializedPlan));


Consider removing logging from this API, for the same reasons as mentioned before.

zabetak · 2025-09-25T10:48:32Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveTableScan.java

+      return null;
+    }
+
+    return HiveRelEnumTypes.toEnum(enumName);


The use of HiveRelEnumTypes seems a bit of an overkill. Can't we simply create the instance directly and drop the entire RelEnumTypes copy?

Suggested change

-    return HiveRelEnumTypes.toEnum(enumName);
+    return HiveTableScanTrait.valueOf(enumName);
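As a reminder of how the built-in `valueOf` behaves, here is a small self-contained sketch. `ScanTrait` and its constants are hypothetical and only stand in for the real trait enum. `ScanTraits.parse` is an illustrative helper, not code from this PR.

```java
// Hypothetical enum standing in for the table scan trait; constant names are illustrative.
enum ScanTrait {
  FULL_SCAN,
  SNAPSHOT
}

final class ScanTraits {
  private ScanTraits() {
  }

  // Maps a serialized name back to the enum constant; a missing name stays null.
  static ScanTrait parse(String enumName) {
    if (enumName == null) {
      return null;
    }
    // Throws IllegalArgumentException if no constant has exactly this name.
    return ScanTrait.valueOf(enumName);
  }
}

final class ScanTraitDemo {
  public static void main(String[] args) {
    System.out.println(ScanTraits.parse("SNAPSHOT"));
    System.out.println(ScanTraits.parse(null));
    try {
      ScanTraits.parse("snapshot"); // valueOf is case-sensitive
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Without the null check, `valueOf(null)` throws a NullPointerException, so whether the guard can be dropped depends on whether the trait is always present in the serialized plan.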

zabetak · 2025-09-25T10:49:38Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveTableScan.java

+    if (enumName == null) {
+      return null;
+    }


Are there cases where we don't serialize the trait? Can we ever have null here?

zabetak · 2025-09-25T11:43:35Z

ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelOptUtil.java

+    }
+
+    JSONObject outJSONObject = new JSONObject(new LinkedHashMap<>());
+    outJSONObject.put("CBOPlan", serializeWithPlanWriter(plan, new HiveRelJsonImpl()));


I don't think we need the extra wrapping attribute for "CBOPlan".

asf-ci-hive added tests pending tests failed and removed tests pending labels Sep 8, 2025

soumyakanti3578 force-pushed the deserializer-HIVE-28197 branch from a63b1f9 to 39b0ec0 Compare September 10, 2025 22:03

asf-ci-hive added tests pending tests unstable and removed tests failed tests pending labels Sep 10, 2025

asf-ci-hive added tests pending tests unstable and removed tests unstable tests pending labels Sep 19, 2025

HIVE-28197: Add deserializer to convert JSON plans to RelNodes

92b45f8

soumyakanti3578 force-pushed the deserializer-HIVE-28197 branch from 8dad604 to 92b45f8 Compare September 23, 2025 21:10

asf-ci-hive added tests pending and removed tests unstable labels Sep 23, 2025

asf-ci-hive added tests passed and removed tests pending labels Sep 23, 2025

soumyakanti3578 changed the title ~~[WIP] - DO NOT REVIEW - Deserializer hive 28197~~ HIVE-28197: Add deserializer to convert JSON plans to RelNodes Sep 24, 2025

soumyakanti3578 marked this pull request as ready for review September 24, 2025 17:46

zabetak reviewed Sep 26, 2025
