Migration to CE3 and other major dependencies upgrade #3389

Merged: 22 commits, Jul 16, 2022

@pomadchin pomadchin commented May 4, 2021

Overview

This PR migrates GT to:

  • Spark 3.2.x
  • CE3
  • Cassandra 4
  • JTS 1.19

It is a breaking change, but we may release it as GT 3.7.0 🤷‍♂️

In CE3 we have the concept of an IORuntime, which encapsulates the thread pools for blocking and non-blocking computations. There is no longer any need to pass a confusing ExecutionContext around.
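
As a rough illustration of the point above (a sketch only, not code from this PR): with CE3, running an effect at the edge of the program needs just an implicit IORuntime, and blocking work is marked with IO.blocking instead of being shifted onto a hand-picked ExecutionContext. The object name IORuntimeSketch and the readTile helper are hypothetical.

```scala
import cats.effect.IO
import cats.effect.unsafe.IORuntime

object IORuntimeSketch {
  // One runtime bundles the compute pool, the blocking pool, and the scheduler.
  implicit val runtime: IORuntime = IORuntime.global

  // Hypothetical blocking read; IO.blocking moves it onto the runtime's blocking pool.
  def readTile(key: String): IO[Array[Byte]] =
    IO.blocking(key.getBytes("UTF-8"))

  // At the edge of the program only the implicit IORuntime is required.
  def run(): Int =
    readTile("tile-0").map(_.length).unsafeRunSync()
}
```

No ExecutionContext or ContextShift appears anywhere in the signatures; the runtime is the only thing that has to be in scope.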

Closes #3395 and a small step towards #3388
Closes #3463
Closes #3382

I decided to bump all major dependencies as part of this PR, including Cassandra and JTS.

CQs

  • GeoTools 25.4
  • Cassandra 4.14.1
  • HBase 2.4.13
  • Cats Effect 3.3.13
  • Cats 2.8.0
  • Circe 0.14.2
  • FS2 3.2.10
  • Pureconfig 0.17.1
  • Scala XML 2.1.0
  • commons-io 2.11.0
  • AWS SDK S3 V2 2.17.228
  • Avro 1.10.2
  • scala-parser-combinators 2.1.1
  • Caffeine 2.9.3
  • woodstox-core 6.3.0
  • commons-configuration2 2.8.0
  • re2j 1.7
  • protobuf-java 3.21.2
  • squants 1.8.3
  • scalactic 3.2.12
  • shapeless 2.3.9
  • unit-api 2.1.3
  • scala-uri 4.0.2
  • scala-collection-compat 2.8.0
  • Circe Json Schema 0.2.0
  • opencsv 5.6
  • Scaffeine 4.1.0

Checklist


object IORuntimeTransient extends Serializable {
  val ThreadsNumber: Int = Runtime.getRuntime.availableProcessors
  @transient lazy val IORuntime: unsafe.IORuntime = unsafe.IORuntime.global

We still need a companion object so that the runtime is initialized as a @transient lazy val. This ensures that it is not serialized and that each machine creates its own instance.
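
To show why @transient lazy val matters here, the following is a minimal sketch using only Java serialization, assuming Scala 2.12 or 2.13. It is not the PR's code: Pool is a hypothetical stand-in for a non-serializable resource such as an IORuntime, and Worker and roundTrip are illustrative names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object TransientLazySketch {
  // Hypothetical stand-in for a non-serializable resource (thread pool, IORuntime, client).
  final class Pool

  class Worker extends Serializable {
    // Skipped by Java serialization and re-created lazily on first access after deserialization.
    @transient lazy val pool: Pool = new Pool
  }

  // Serialize and deserialize a value, roughly what happens when Spark ships it to an executor.
  def roundTrip[A <: java.io.Serializable](a: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(a)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A]
    finally in.close()
  }
}
```

Without @transient, writeObject would fail once pool has been initialized, because Pool is not Serializable. With it, the deserialized copy builds its own Pool on first use.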

@pomadchin pomadchin marked this pull request as ready for review June 27, 2022 00:40
@pomadchin pomadchin force-pushed the upd/ce3 branch 2 times, most recently from b82e0ed to 301c401 Compare June 27, 2022 00:44
@@ -74,7 +72,7 @@ object SaveToS3 {
       rdd.foreachPartition { partition =>
         val s3client = s3Client
         val requests: fs2.Stream[IO, (PutObjectRequest, RequestBody)] =
-          fs2.Stream.fromIterator[IO](
+          fs2.Stream.fromBlockingIterator[IO](

By using S3AsyncClient we could eliminate a lot of blocking calls, simplify the code, and make it more resilient; see #3467.
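
For context on the fromBlockingIterator call in the hunk above, here is a minimal sketch of how it is used in fs2 3.x. This is an illustration under assumptions, not the SaveToS3 code: BlockingIteratorSketch, drain, and the chunk size of 64 are hypothetical.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream

object BlockingIteratorSketch {
  // Pull from an iterator whose next() may block (e.g. a Spark partition iterator);
  // fs2 evaluates the pulls on the runtime's blocking pool, chunkSize elements at a time.
  def drain(records: Iterator[Int]): IO[List[Int]] =
    Stream
      .fromBlockingIterator[IO](records, chunkSize = 64)
      .map(_ * 2)
      .compile
      .toList

  def run(): List[Int] = drain(Iterator(1, 2, 3)).unsafeRunSync()
}
```

Compared with fromIterator, the blocking variant keeps potentially slow iterator pulls off the compute pool.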

@pomadchin pomadchin force-pushed the upd/ce3 branch 9 times, most recently from a4372ef to f8c90c0 Compare July 10, 2022 16:19
@pomadchin pomadchin changed the title from "Migration to CE3" to "Migration to CE3 and major dependencies upgrade" Jul 10, 2022

If anyone has the time or the will to help with the CQs, that would be very much appreciated :D


I think I'm going to merge this in and handle the CQs later. It's a pretty major change, so let's base all the new work on top of it.

3.7.0 🚀

@pomadchin pomadchin merged commit 0276d78 into locationtech:master Jul 16, 2022
@pomadchin pomadchin deleted the upd/ce3 branch July 16, 2022 16:43

pomadchin commented Nov 3, 2022


Successfully merging this pull request may close these issues.

  • Revert graceful JTS fall back and lazy Circe encoders
  • Cats Effects 3 upgrade
  • Update Cassandra up to 4.x