Closes #1426: Run IIS experiments by relying on spark 3.4 version
WIP.

Removing the `provided` scope from the `spark-avro_2.12` dependency until it becomes part of sharelib342.
Introducing the fixes required for the `eu/dnetlib/iis/wf/export/actionmanager/relation/citation/default` integration test to run on spark3:
* setting `spark.extraListeners` and `spark.sql.queryExecutionListeners` explicitly to empty values, to avoid relying on the Cloudera listeners, which are spark2-compliant and incompatible with spark3
* setting `spark.shuffle.useOldFetchProtocol=true`, as required by the `2.4 to 3.0 migration guide` for backward compatibility of the protocol used to fetch shuffle blocks (this avoids errors of the `IllegalArgumentException: Unexpected message type: <number>` kind)
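For reference, the three overrides listed above amount to the following `--conf` entries inside the action's `<spark-opts>` element. This is a minimal sketch based on the commit description: the remaining options of the element are left out, and the comment about cluster defaults is an assumption drawn from the wording of the message, not a verified fact about the target cluster.

```xml
<spark-opts>
    <!-- sketch only: the other options of this element are omitted -->
    <!-- empty values, so that no extra listeners are registered
         (assumption: cluster defaults would otherwise register the spark2 Cloudera ones) -->
    --conf spark.extraListeners=
    --conf spark.sql.queryExecutionListeners=
    <!-- keep the pre-3.0 protocol for fetching shuffle blocks -->
    --conf spark.shuffle.useOldFetchProtocol=true
</spark-opts>
```

The migration guide describes `spark.shuffle.useOldFetchProtocol` as the way to keep using an older external shuffle service from a Spark 3 application, so the last entry presumably can be dropped once the shuffle service itself is upgraded.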
marekhorst committed Oct 3, 2023
1 parent 9587b17 commit 29c165d
Showing 2 changed files with 6 additions and 12 deletions.
@@ -42,16 +42,6 @@
<name>oozieActionShareLibForSpark2</name>
<description>oozie action sharelib for spark 2.*</description>
</property>
-<property>
-<name>spark2ExtraListeners</name>
-<value>com.cloudera.spark.lineage.NavigatorAppListener</value>
-<description>spark 2.* extra listeners classname</description>
-</property>
-<property>
-<name>spark2SqlQueryExecutionListeners</name>
-<value>com.cloudera.spark.lineage.NavigatorQueryListener</value>
-<description>spark 2.* sql query execution listeners classname</description>
-</property>
<property>
<name>spark2YarnHistoryServerAddress</name>
<description>spark 2.* yarn history server address</description>
@@ -94,10 +84,11 @@
--executor-memory=${sparkExecutorMemory}
--executor-cores=${sparkExecutorCores}
--driver-memory=${sparkDriverMemory}
---conf spark.extraListeners=${spark2ExtraListeners}
---conf spark.sql.queryExecutionListeners=${spark2SqlQueryExecutionListeners}
+--conf spark.extraListeners=
+--conf spark.sql.queryExecutionListeners=
--conf spark.yarn.historyServer.address=${spark2YarnHistoryServerAddress}
--conf spark.eventLog.dir=${nameNode}${spark2EventLogDir}
+--conf spark.shuffle.useOldFetchProtocol=true
</spark-opts>
<arg>-inputCitationsPath=${input_citations}</arg>
<arg>-outputRelationPath=${output_root_relations}/${action_set_id_citation_relations}</arg>
3 changes: 3 additions & 0 deletions pom.xml
@@ -386,7 +386,10 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_2.12</artifactId>
<version>${iis.spark.version}</version>
+<!-- FIXME not available in spark3 sharelib folder yet, commenting out until finishing the testing phase -->
+<!--
<scope>provided</scope>
+-->
</dependency>

<dependency>