Major Additions
Service account support for spark-infrastructure
To provide a more flexible and secure way to authenticate with AWS services, the spark-infrastructure Helm chart has been enhanced to support AWS IRSA (IAM Roles for Service Accounts) authentication. See How to Upgrade for more information.
Path to Production Alignment
To better align development processes with processes in CI/CD and higher environments, we no longer recommend using Tilt live-reloading. As such, upgrading projects should consider narrowing the scope of their Tiltfile. These changes will also help smooth the transition as further alignment is brought to the path to production. See How to Upgrade for more information.
Data Access Upgrade
Data access through GraphQL has been deprecated and replaced with Trino. Trino is optimized for performing queries against large datasets by leveraging a distributed architecture that processes queries in parallel, enabling fast and scalable data retrieval.
Spark Upgrade
Spark and PySpark have been upgraded from version 3.5.2 to 3.5.4.
Record Relation
To enable nested data records, we have added a new relation feature to the record metamodel. This allows records to reference other records. For more details, refer to the Record Relation Options in the aiSSEMBLE User Guide.
Several features are still a work in progress:
- PySpark and Spark schema-based validation for relations will only validate the record itself, not its relations. Object-based validation for relations is available.
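The distinction above can be sketched in plain Python. The classes and method names below are purely illustrative, not the aiSSEMBLE API: a schema-style check validates only the top-level record's fields, while an object-based check also recurses into related records.

```python
# Hypothetical sketch (names are illustrative, not the aiSSEMBLE API).

class Address:
    """A related record."""
    def __init__(self, zip_code):
        self.zip_code = zip_code

    def is_valid(self):
        # Object-based validation of this record's own fields.
        return len(self.zip_code) == 5


class Person:
    """A record with a relation to another record."""
    def __init__(self, name, address):
        self.name = name
        self.address = address  # the relation

    def is_valid_schema(self):
        # Schema-style check: validates only this record's fields,
        # not its relations.
        return bool(self.name)

    def is_valid(self):
        # Object-based check: validates the record AND its relations.
        return self.is_valid_schema() and self.address.is_valid()


person = Person("Ada", Address("123"))
print(person.is_valid_schema())  # True: top-level fields pass
print(person.is_valid())         # False: the related Address fails
```

Until schema-based relation validation lands, only the object-based path catches the invalid related record in a case like this.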
Helm Charts Resource Specification
The following Helm charts have been updated to include the configuration options for specifying container resource requests/limits:
- aissemble-spark-history-chart
- aissemble-quarkus-chart
See the official Kubernetes documentation for more details.
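As an illustration, a values override for one of these charts might look like the following sketch. The deployment.resources nesting and the specific values are assumptions; consult each chart's values.yaml for the actual keys.

```yaml
# Sketch only -- key nesting varies by chart; check the chart's values.yaml.
aissemble-quarkus-chart:
  deployment:
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
      limits:
        cpu: "1"
        memory: 1Gi
```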
Breaking Changes
Note: instructions for adapting to these changes are outlined in the upgrade instructions below.
- The following Java classes have been renamed:
Old Java Class | New Java Class |
---|---|
AIOpsModelInstanceRepostory | AissembleModelInstanceRepository |
AiopsMdaJsonUtils | AissembleMdaJsonUtils |
- To improve the development cycle and Docker build consistency, we have deprecated the docker_build() and local_resources() functions in the Tiltfile and enabled the Maven Docker build for the Docker modules. Follow the instructions in Finalizing the Upgrade to avoid duplicated Docker image builds.
- In an attempt to harden the aissemble-hive-service image, several changes were made that may impact projects with Hive customizations.
- The Amazon AWS SDK has been upgraded from v1 to v2, which includes a group and artifact ID change.
Known Issues
Errors on Initial Build
During the first build of a new project, or the first build after an upgrade to 1.11, Baton may output error messages saying that POM files could not be processed for each module. This is a harmless side effect of the data-access-default-migration and will go away on its own when the migration turns itself off by updating the deactivateMigrations configuration of Baton.
Docker Module Build Failures
When using a Docker daemon whose socket does not reside in /var/run (e.g., running Rancher Desktop without admin privileges), the docker-maven-plugin will fail to build with the message below. To work around this failure, set the DOCKER_HOST environment variable to the location of the daemon socket file. For example, to make the docker-maven-plugin work with Rancher Desktop, run export DOCKER_HOST=unix://$HOME/.rd/docker.sock.
[ERROR] Failed to execute goal org.technologybrewery.fabric8:docker-maven-plugin:0.45-tb-0.1.0:build (default-build) on project final-513-spark-worker-docker:
Execution default-build of goal org.technologybrewery.fabric8:docker-maven-plugin:0.45-tb-0.1.0:build failed:
No <dockerHost> given, no DOCKER_HOST environment variable, no read/writable '/var/run/docker.sock' or '//./pipe/docker_engine' and no external provider like Docker machine configured
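The workaround above can be wrapped in a small shell snippet. This is a sketch: the socket paths checked are the common default and Rancher Desktop's user socket, so adjust them for your setup.

```shell
# Pick a Docker socket for DOCKER_HOST (sketch; paths are assumptions).
# Checks the default location first, then Rancher Desktop's user socket.
for sock in /var/run/docker.sock "$HOME/.rd/docker.sock"; do
  if [ -S "$sock" ]; then
    export DOCKER_HOST="unix://$sock"
    break
  fi
done
echo "DOCKER_HOST=${DOCKER_HOST:-<not set>}"
```

Placing this in your shell profile avoids having to remember the export before each build.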
Known Vulnerabilities
Date identified | Vulnerability | Severity | Package | Affected versions | CVE | Fixed in |
---|---|---|---|---|---|---|
How to Upgrade
The following steps will upgrade your project to 1.11. These instructions consist of multiple phases:
- Automatic Upgrades - no manual action required
- Precondition Steps - needed in all situations
- Conditional Steps (e.g., Python steps, Java steps, if you use Metadata, etc)
- Final Steps - needed in all situations
Automatic Upgrades
To reduce the burden of upgrading aiSSEMBLE, the Baton project is used to automate the migration of some files to the new version. These migrations run automatically when you build your project and are included by default when you update the build-parent version in your root POM. Below is a description of all the Baton migrations included with this version of aiSSEMBLE.
Migration Name | Description |
---|---|
upgrade-tiltfile-aissemble-version-migration | Updates the aiSSEMBLE version within your project's Tiltfile |
upgrade-v2-chart-files-aissemble-version-migration | Updates the Helm chart dependencies within your project's deployment resources (<YOUR_PROJECT>-deploy/src/main/resources/apps/ ) to use the latest version of aiSSEMBLE |
upgrade-v1-chart-files-aissemble-version-migration | Updates the Docker image tags within your project's deployment resources (<YOUR_PROJECT>-deploy/src/main/resources/apps/ ) to use the latest version of aiSSEMBLE |
pipeline-invocation-service-template-migrtion | Includes the helm.valueFiles parameter in the ArgoCD pipeline-invocation-service template |
docker-module-pom-dependency-type-migration | Updates the Maven pipeline dependency type within your project's Docker sub-module POM files (<YOUR_PROJECT>-docker/*-docker/pom.xml ) to fix the build cache checksum calculation issue |
enable-maven-docker-build-migration | Removes the Maven fabric8 plugin skip configuration within your project's Docker module POM file (<YOUR_PROJECT>-docker/pom.xml ) to enable the Maven Docker build |
ml-pipeline-docker-pom-migration | Adds ML pipeline dependencies to relevant Docker POMs to improve the Maven build cache functionality |
training-api-image-tag-migration | Updates training Docker image tags to use the project version |
inference-docker-image-tag-migration | Updates inference Docker image tags to use the project version |
spark-worker-docker-image-tag-migration | Updates Spark Docker image tags to use the project version |
spark-infrastructure-server-side-label-migration | Enables the Server-Side Diff Strategy within ArgoCD for the Spark Infrastructure resource so that changes made by the Universal Config Store mutating webhook are ignored |
data-access-default-migration | Migrates Record metamodels that were relying on the default Data Access settings to preserve semantics with the updated default value. This migration also disables itself to ensure it's only run once. |
To deactivate any of these migrations, add the following configuration to the baton-maven-plugin within your root pom.xml:
<plugin>
<groupId>org.technologybrewery.baton</groupId>
<artifactId>baton-maven-plugin</artifactId>
<dependencies>
<dependency>
<groupId>com.boozallen.aissemble</groupId>
<artifactId>foundation-upgrade</artifactId>
<version>${version.aissemble}</version>
</dependency>
</dependencies>
+ <configuration>
+ <deactivateMigrations>
+ <deactivateMigration>NAME_OF_MIGRATION</deactivateMigration>
+ <deactivateMigration>NAME_OF_MIGRATION</deactivateMigration>
+ </deactivateMigrations>
+ </configuration>
</plugin>
Precondition Steps - Required for All Projects
Beginning the Upgrade
To start your aiSSEMBLE upgrade, update your project's pom.xml to use the 1.11.1 version of the build-parent:
<parent>
<groupId>com.boozallen.aissemble</groupId>
<artifactId>build-parent</artifactId>
<version>1.11.1</version>
</parent>
Tilt Docker Builds
To avoid duplicate Docker builds, remove all related docker_build() and local_resources() functions from your Tiltfile. The spark-worker-image.yaml is also no longer used, so the <project>-deploy/src/main/resources/apps/spark-worker-image directory and the related k8s_yaml() function in your Tiltfile can be removed. If you have customized the image names in the docker_build Tilt commands, the image names for local builds will likely need to be customized as well by updating the new docker-maven-plugin configuration in the parent Docker POM at <project>-docker/pom.xml after running the first build.
Additionally, the aissemble-spark-worker-deploy Fermenter profile has been removed, and the following execution must be removed from the deploy POM regardless of the Tiltfile changes:
<execution>
<id>spark-worker-image</id>
<phase>generate-sources</phase>
<goals>
<goal>generate-sources</goal>
</goals>
<configuration>
<basePackage>your.base.package</basePackage>
<profile>aissemble-spark-worker-deploy</profile>
<!-- The property variables below are passed to the Generation Context and utilized
to customize the deployment artifacts. -->
<propertyVariables>
<appName>spark-worker-image</appName>
</propertyVariables>
</configuration>
</execution>
Conditional Steps
For projects that have customized the Hive service
Several changes were made to both the Hive service Docker image and the Hive service chart included as part of a project's Spark Infrastructure chart. The defaults have been adjusted so that these changes should be transparent; however, due to the nature of some possible customizations, this may not always hold true. The following changes may impact your customizations and may need to be accounted for:
- The image is now only the Hive Standalone Metastore service and cannot function as a full Hive Server
- The Java installation at /opt/java is no longer symlinked to /opt/jre; JAVA_HOME has been adjusted accordingly by default
- The default working directory for the aissemble-hive-service image was changed from /opt to /opt/hive
- Schema initialization is no longer done as part of an initContainer in the aissemble-hive-service-chart and is instead done in a new entrypoint script. This is consistent with the official apache/hive Docker image.
For projects deploying to EKS
This step is not required, but it is the recommended way to authenticate with AWS services:
- Create an IAM OIDC provider for your cluster
- Follow the Assign IAM roles to Kubernetes service accounts document but skip the step that creates the service account
- In the spark-infrastructure chart template, add the service account creation configuration as shown below:
aissemble-spark-history-chart
aissemble-spark-history-chart:
serviceAccount:
name: service-account-name
create: true
metadata:
annotations:
# Ref: IAM role ARN from step 2
eks.amazonaws.com/role-arn: arn:aws:iam::aws-id:role/iam-role-name
aissemble-thrift-server-chart:
aissemble-thrift-server-chart:
deployment:
# service account name must match the service account name specified in the IAM roles trust relationships
serviceAccountName: service-account-name
For projects using the Configuration Store
The configurationVolume property keys in the aissemble-configuration-store chart have been renamed to persistentVolumeProperties and will need to be updated in <project>-deploy/src/main/resources/apps/configuration-store/values*.yaml. Consult the chart's README for more information.
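For illustration, the rename might look like the following in a values file. The nested keys shown here are assumptions made for the sketch; the chart's README has the authoritative structure.

```yaml
# Sketch only -- nested keys are illustrative; see the chart's README.
# Before (1.10):
# configurationVolume:
#   storageClass: standard
#   size: 1Gi
# After (1.11):
persistentVolumeProperties:
  storageClass: standard
  size: 1Gi
```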
Final Steps - Required for All Projects
Finalizing the Upgrade
- Run ./mvnw org.technologybrewery.baton:baton-maven-plugin:baton-migrate to apply the automatic migrations
- Run ./mvnw clean install and resolve any manual actions that are suggested
  - NOTE: This will update any aiSSEMBLE dependencies in pyproject.toml files automatically
- Repeat the previous step until all manual actions are resolved
What's Changed
- #470 remove out of date autogenerated readme by @meliz19 in #471
- Update to 1.11.0-SNAPSHOT by @ewilkins-csi in #478
- Include CI info in README for spark infra chart by @cpointe-ibllanos in #464
- [#469] use quotes for display version in Antora docs by @ewilkins-csi in #480
- SLACK: improve fetching pyspark profiles and clean up getImports for … by @jaebchoi in #481
- Replace graphql data access with trino by @carter-cundiff in #485
- [#483] add infrastructure module and enable ArgoCD local deployment by @csun-cpointe in #487
- #483 fix the script error by @csun-cpointe in #489
- #474 - Adding workflow step to push cucumber report to GitHub Pages branch. by @habibimoiz in #490
- #491 fix pipeline invocation service ArgoCD template missing helm.values parameter by @csun-cpointe in #492
- [#488] update extensions-helm README to work with ArgoCD by @ewilkins-csi in #493
- #484 Feature: integrate Universal Config store in the hive metastore service by @jaebchoi in #494
- 499 - Update path to redpanda image by @meliz19 in #500
- #501 Clean up ConfigMap Logic in the Hive-metastore-service to improve space complexity by @jaebchoi in #504
- #495 Add functionality for extending foundation-mda by @carter-cundiff in #505
- #482 enable docker build and fix docker module dependency cache issue by @csun-cpointe in #506
- #507 revert the 'push cucumber report to GitHub Pages branch' workflow step by @csun-cpointe in #509
- [#513] remove code preventing release of 1.11 by @ewilkins-csi in #514
- [#513] remove noRdAdmin profile by @ewilkins-csi in #515
- [#498] perform config injection on update by @ewilkins-csi in #517
- [#510] enable ml pipeline maven docker build by @csun-cpointe in #516
- #518 Resolve CVEs within hive metastore docker image by @carter-cundiff in #521
- #523 - Poetry version fixes by @meliz19 in #524
- #502 add irsa (IAM roles service account) support for spark-infrastructure by @csun-cpointe in #526
- #519 Feature: implement server side diff for mutating webhook from configstore to resolve out of sync issue in ArgoCD by @jaebchoi in #530
- #511 fix the generated python step data record import statement by @csun-cpointe in #531
- #520 Integrate relations into metamodel and remove composite types by @aaron-gary in #535
- #527 Update libraries on the Pipeline Invocation Service and Configuration Store Docker Images by @carter-cundiff in #536
- [#525] update libraries on Spark images by @ewilkins-csi in #537
- #534 implement record relation within the pojo class by @csun-cpointe in #542
- #533 Feature: Support Configuration Store values to use properties to inject values by @jaebchoi in #545
- [#540] update jars/libs in hive image by @ewilkins-csi in #551
- [#540] exclude jdk.tools frome hive dependency by @ewilkins-csi in #552
- #547 SLACK: Port secret configuration store injection support by @jaebchoi in #550
- #543 Record with relations should as Inbound/Outbound types by @carter-cundiff in #553
- Update helm charts to allow resource specification by @carter-cundiff in #554
- [#540] fix hive persistence by @ewilkins-csi in #555
- #539 Spark schema generation works with record relations by @cwoods-cpointe in #556
- #548 record with relations should be able to validate reference record fields by @csun-cpointe in #557
- [#549] fix Thrift server AWS auth locally by @ewilkins-csi in #562
- [#561] spark image hardening by @ewilkins-csi in #563
- #560 spark schemas validation retain the meta model records relations of 1-M multiplicity by @csun-cpointe in #565
- #560 address pr comments by @csun-cpointe in #566
- [538] Supporting functionality for encryption of records with relations Part 1 by @aaron-gary in #567
- #558 PySpark schema functionality works for record relations by @cwoods-cpointe in #569
- #571 Disable relations validations in Spark schema for records with relations by @carter-cundiff in #574
- [#570] pre-release cleanup by @ewilkins-csi in #575
Full Changelog: aissemble-root-1.10.0...aissemble-root-1.11.1