#5 📝 Tranche 4 of documentation migration
1 parent 8828ab4 · commit 14854ae · 7 changed files with 923 additions and 0 deletions.
= Alerting

== Overview

The purpose of alerting is to bring attention to significant events and issues that arise during the execution of a
pipeline by sending messages via email, Slack, etc. To simplify the incorporation of alerting, pre-constructed patterns
have been developed and can be included in a https://github.com/boozallen/aissemble[Solution Baseline,role=external,window=_blank]
project. This means there are only a few steps necessary to incorporate generated code for alerting purposes. This page
explains the generated components that are included when alerting is enabled and describes where to modify and
customize elements to suit a specific implementation.

== What Gets Generated
Alerting is xref:pipeline-metamodel.adoc#_pipeline_metamodel[enabled by default]
for projects that have a pre-fab data delivery pipeline.

[WARNING]
Alerting is currently only available for Spark Data Delivery Pipelines; support for PySpark Data Delivery and Machine
Learning Pipelines will be available in a future version.

=== Default Methods for Sending Alerts
When alerting is enabled, a few methods (outlined below) are generated in the base class of each step. These methods are
called automatically upon step completion (whether it completes successfully or with an exception) to send an alert. All
of these methods have default logic, but they can be customized by overriding the method in the step implementation
class.

.sendAlert
[source]
----
protected void sendAlert(Alert.Status status, String message)
Send an alert with a given status and message.
Override this method to customize how messages are sent to the alerting framework.
Parameters:
    status – the status of the alert
    message – the message
----

.getSuccessMessage
[source]
----
protected String getSuccessMessage(Map<String, String> params)
Returns the message sent via alerting when the step completes successfully. Override this method to provide your own success message.
Parameters:
    params – map of parameters for the success message, including the execution duration under the key timeToComplete.
Returns:
    Success message with the action and the time to complete.
----

.getErrorMessage
[source]
----
protected String getErrorMessage(Exception e)
Returns the message sent via alerting when the step throws an exception. Override this method to provide your own error message.
Parameters:
    e – the exception that caused the step to fail.
Returns:
    The detailed error message.
----
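
To make this concrete, below is a minimal sketch of overriding these hooks in a step implementation class. The class
names (`IngestStep`, `IngestStepBase`) and the pipeline name are hypothetical placeholders; only the method signatures
above come from the generated code.

[source,java]
----
import java.util.Map;
// Alert is provided by the aiSSEMBLE alerting framework; its package is omitted here

public class IngestStep extends IngestStepBase { // hypothetical step/base class names

    @Override
    protected String getSuccessMessage(Map<String, String> params) {
        // Reuse the generated timing parameter in a custom success message
        return "Ingest completed in " + params.get("timeToComplete");
    }

    @Override
    protected void sendAlert(Alert.Status status, String message) {
        // Tag every alert with the pipeline name, then defer to the default send logic
        super.sendAlert(status, "[my-pipeline] " + message);
    }
}
----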

== Configuring Your Alerting Service
The Solution Baseline provides several integration options for alerting purposes.

=== Alerting with Slack
The default alerting implementation is Slack. To use Slack alerting, follow the steps below:

. Add the aiSSEMBLE Slack alerting dependency `extensions-alerting-slack` to the pipeline POM:
[source,xml]
----
<dependencies>
    ...
    <dependency>
        <groupId>com.boozallen.aissemble</groupId>
        <artifactId>extensions-alerting-slack</artifactId>
    </dependency>
    ...
</dependencies>
----

[start=2]
. Add the `SlackConsumer` bean to the pipeline within the `PipelinesCdiContext.java` file:

[source,java]
----
public List<Class<?>> getCdiClasses() {
    // Add any custom CDI classes here
    ...
    customBeans.add(SlackConsumer.class);
    return customBeans;
}
----

[start=3]
. Create the `slack-integration.properties` file in the following path:
`<project>-docker/<project>-spark-worker-docker/src/main/resources/krausening/base/slack-integration.properties`
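
The keys in this file are defined by the `extensions-alerting-slack` module; consult that module for the real schema.
Purely as an illustrative sketch, with entirely hypothetical property names, it might look like:

[source]
----
# Hypothetical example only -- property names are illustrative, not the actual schema
slack.token=xoxb-your-bot-token
slack.channel=pipeline-alerts
----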

=== Kafka Integration
The default alerting implementation can be extended to publish the alerts to an Apache Kafka topic. Adding a
`microprofile-config.properties` file with the following configurations will enable the Kafka integration for the
default Alert Producer:

.<spark-data-delivery-pipeline>/src/main/resources/META-INF/microprofile-config.properties
[source]
----
kafka.bootstrap.servers=kafka-cluster:9093 <1>
mp.messaging.outgoing.alerts.connector=smallrye-kafka
mp.messaging.outgoing.alerts.topic=kafka-alert-topic-name <2>
mp.messaging.outgoing.alerts.key.serializer=org.apache.kafka.common.serialization.StringSerializer
mp.messaging.outgoing.alerts.value.serializer=org.apache.kafka.common.serialization.StringSerializer
----
<1> The hostname and port of the Kafka server to connect to.
<2> The name of the Kafka topic to publish the alerts to.

Please see the https://smallrye.io/smallrye-reactive-messaging/latest/kafka/kafka[SmallRye documentation,role=external,window=_blank]
on the Kafka connector for more configuration details.
[#_bias_detection]
= Bias Detection
Bias detection, also known as Ethical Artificial Intelligence (AI), is concerned with determining whether an AI model
systematically produces inaccurate results due to flawed assumptions. One contributing factor to model bias is the data
the model learns from. By driving bias detection from a semantic data model, consistent bias detection policies are
applied to the related field(s) throughout the data.

To implement bias detection within your project, please contact the https://stackoverflowteams.com/c/boozallensolutioncenter/questions[aiSSEMBLE team]
for integration and implementation guidance.
= Deploying the Project
:source-highlighter: rouge

AI/ML projects are generally built using scripts or notebooks, which are well suited for prototyping and simple
implementations but lack Software Development Lifecycle (SDLC) best practices such as unit/integration testing, peer
reviews, and a consistent build process. aiSSEMBLE provides a structured approach for designing, developing, deploying,
and monitoring AI/ML solutions to standardize delivery and drive consistency and reliability. A key component of this
approach is automating the building, testing, and deployment of software through Continuous Integration and Continuous
Delivery (CI/CD). The following outlines the deployment and delivery approach in aiSSEMBLE.

== Deployment Artifacts
aiSSEMBLE makes your project portable, scalable, and platform-agnostic by using Docker to create “images,” which are
blueprints for containers. https://docs.docker.com/build/[Docker,role=external,window=_blank] is a software platform
designed to help developers build, share, and run modern applications. Docker is used in aiSSEMBLE to create portable
software components packaged up for deployment in a containerized environment.

Container orchestration is important for automating deployments. https://kubernetes.io/docs/home/[Kubernetes,role=external,window=_blank],
also known as K8s, is an open-source system for automating the deployment, scaling, and management of containerized
applications. aiSSEMBLE generates Kubernetes artifacts to ease the management and scalability of your project.

Helm is used in aiSSEMBLE as the package management tool and template engine for Kubernetes. https://helm.sh/docs/[Helm,role=external,window=_blank]
is a tool that streamlines installing and managing Kubernetes applications. Think of it like apt/yum/homebrew for
Kubernetes. Helm packages and deploys aiSSEMBLE’s Kubernetes applications while also providing templating services that
allow for easy modifications.
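
For instance, a chart can be installed, upgraded, and removed with standard Helm commands; the release and chart names
below are hypothetical placeholders for whatever your project generates:

[source]
----
# Install the chart into the current Kubernetes context (names are placeholders)
helm install my-project ./my-project-deploy
# Roll out updated templates or values
helm upgrade my-project ./my-project-deploy
# Tear the release back down
helm uninstall my-project
----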

== Deployment Infrastructure

=== Local Deployment
aiSSEMBLE’s framework enables rapid development and testing by ensuring local build and deployment processes are fast,
alleviating the need for ad-hoc scripts and notebooks. To achieve this, your project needs to be deployable in an
environment where it can be easily stood up and torn down locally. In doing so, you ensure that when you deploy your
project to a higher environment, all the pieces work together cohesively, similar to how they would in production. The
two components necessary to get to this state are a local Kubernetes environment and a local deployment tool for
Kubernetes.

The aiSSEMBLE team promotes the usage of https://docs.rancherdesktop.io/[Rancher Desktop,role=external,window=_blank]
for the local Kubernetes environment and management tool. Rancher Desktop is a lightweight, user-friendly tool which
comes packaged with critical tools such as Helm, Docker, and Kubernetes. By deploying to a real Kubernetes environment,
Rancher Desktop allows you to test integration points between the key components of your project.

To ease testing in your local Kubernetes environment, you need a simple tool that can deploy your entire project
quickly. The aiSSEMBLE team encourages the usage of https://docs.tilt.dev/[Tilt,role=external,window=_blank] as your
local deployment tool for Kubernetes. By default, aiSSEMBLE will generate Tilt deployment files to get you started.
Tilt can deploy your project (in its entirety or partially) with a single command and provides a user-friendly
interface to monitor your container activity and logs. In addition, Tilt keeps the deployment up to date with the
latest code changes with very little downtime.
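
To illustrate the idea (this is a bare-bones sketch, not the generated aiSSEMBLE files, and the image and manifest
names are hypothetical), a minimal Tiltfile that builds one image and deploys one manifest might look like:

[source]
----
# Minimal hypothetical Tiltfile -- the generated aiSSEMBLE files are more complete
docker_build('my-project-spark-worker', 'my-project-docker/my-project-spark-worker-docker')  # build the image from its Docker context
k8s_yaml('my-project-deploy/apps/spark-worker.yaml')  # deploy the rendered Kubernetes manifest
----

Running `tilt up` starts the deployment and the monitoring UI; `tilt down` tears it back down.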

=== Remote Deployment
Including continuous integration (CI) is a best practice for unit/integration testing and consistent builds. By default,
aiSSEMBLE will include starter Jenkins CI pipelines for building, testing, packaging, and deploying your project.
Jenkins is an open-source DevOps automation tool commonly used for CI.

aiSSEMBLE enables standardized delivery and monitoring to drive consistency and reliability. ArgoCD is a tool that
deploys and continuously monitors running applications, comparing the current, live state against the desired target
state. aiSSEMBLE promotes ArgoCD’s app-of-apps pattern in the Helm charts generated for your project.

== Related Pages

- xref:guides/guides-spark-job.adoc[]
= Data Access

== Overview
Data access is the process of exposing data to external consumers. aiSSEMBLE supports this through generated services
and records.

== What Gets Generated
Data access is xref:pipeline-metamodel.adoc#_pipeline_metamodel[enabled by default] for projects that include at least
one record. When enabled, aiSSEMBLE generates a https://graphql.org/learn/[GraphQL,role=external,window=_blank] query
service with endpoints for retrieving records from ingested datasets.

|===
|Generated file |Description

|`<project>/<project>-pipelines/<project>-data-access/pom.xml`
|Creates the Maven module that builds the generated query service.

|`<project>/<project>-pipelines/<project>-data-access/src/main/resources/application.properties`
|https://quarkus.io/guides/config[Quarkus,role=external,window=_blank] configuration of the query service.

|`<project>/<project>-pipelines/<project>-data-access/src/main/java/com/test/DataAccessResource.java`
|GraphQL resource that exposes the /graphql REST endpoint for data access requests.
|===

=== GraphQL API
GraphQL queries are generated based on the record metamodel(s) in `<project>/<project>-pipeline-models/src/main/resources/records/`.
By default, two queries are generated for each record metamodel: one for retrieving all the results from a table, and
one for retrieving a limited number of results from a table. The methods that implement these queries can be found in
`<project>/<project>-pipelines/<project>-data-access/src/generated/java/<user-defined-package>/DataAccessResourceBase.java`.
These methods can be overridden, or new queries can be added, by modifying `<project>/<project>-pipelines/<project>-data-access/src/main/java/<user-defined-package>/DataAccessResource.java`,
as shown in the sketch below.
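
As an illustrative sketch only, a custom query added to `DataAccessResource.java` using the MicroProfile GraphQL
annotations (provided by the Quarkus SmallRye GraphQL extension) might look like the following. The record type,
class names, and the generated base-class method name (`getTaxPayer`) are hypothetical placeholders:

[source,java]
----
import java.util.List;
import java.util.stream.Collectors;

import org.eclipse.microprofile.graphql.GraphQLApi;
import org.eclipse.microprofile.graphql.Name;
import org.eclipse.microprofile.graphql.Query;

@GraphQLApi
public class DataAccessResource extends DataAccessResourceBase {

    /** Hypothetical custom query that filters the generated "retrieve all" results by id. */
    @Query("TaxPayerById")
    public List<TaxPayer> taxPayerById(@Name("table") String table, @Name("id") String id) {
        // getTaxPayer(...) stands in for whatever the generated base class names its query method
        return getTaxPayer(table).stream()
                .filter(record -> id.equals(record.getId()))
                .collect(Collectors.toList());
    }
}
----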

.GraphQL query to pull records from a given table:
[source,json]
----
query auditList {
    TaxPayer(table: "delinquent_tax_payers")
    {
        id
    }
}
----

|===
|Element |Element Type |Element Description

|auditList
|Operation name
|Name of the query. The name assigned to this operation has no correlation to the pipeline or metamodel; it is simply of your choosing.

|TaxPayer
|Query object
|The type of record that you are pulling from the data store. This name is derived from your record metamodel.

|delinquent_tax_payers
|Argument
|Name of the table being queried. During the execution of the data pipeline, your records are stored in a table with the
name you specified in your step implementation.

|id (String)
|Variable
|Field from the record type being returned. The available fields correspond with the fields within your record metamodel.
|===

.GraphQL query to pull records from a given table with a limit:
[source,json]
----
query auditList {
    TaxPayerLimited(table: "delinquent_tax_payers", limit: 10)
    {
        id
    }
}
----

|===
|Element |Element Type |Element Description

|auditList
|Operation name
|Name of the query. The name assigned to this operation has no correlation to the pipeline or metamodel; it is simply of your choosing.

|TaxPayerLimited
|Query object
|The type of record that you are pulling from the data store. This name is derived from your record metamodel.

|delinquent_tax_payers
|Argument
|Name of the table being queried. During the execution of the data pipeline, your records are stored in a table with the
name you specified in your step implementation.

|limit (int)
|Argument
|Limit on how many records are to be returned from the query.

|id (String)
|Variable
|Field from the record type being returned. The available fields correspond with the fields within your record metamodel.
|===

To invoke the GraphQL query, you will need to do so via a REST API call.

=== POST /graphql
.Returns the records for the given GraphQL query.
[%collapsible]
====
****
*Parameters*
|===
|*Name* |*Description*

|query
|https://graphql.org/learn/queries/[GraphQL query,role=external,window=_blank] executed to retrieve the data.
|===
*Return*
[cols="1,1"]
|===
|{record-name} records.
|List of records. The record will be based on your record metamodel.
|===
.Sample data input:
[source,JSON]
----
{
    "query": "{ ExampleDataLimited(table: \"example_table\", limit: 10) { id } }"
}
----
.Sample data output:
[source,JSON]
----
{
    "data": {
        "ExampleDataLimited": []
    }
}
----
****
====
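
For example, the query above could be submitted with curl; the host and port are assumptions (8080 is the Quarkus
default) and may differ in your deployment:

[source]
----
# Host and port are assumptions; adjust to your deployment
curl -X POST http://localhost:8080/graphql \
  -H 'Content-Type: application/json' \
  -d '{ "query": "{ ExampleDataLimited(table: \"example_table\", limit: 10) { id } }" }'
----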

=== Deployment Artifacts
Once a data access record has been defined, aiSSEMBLE will also generate deployment artifacts like Docker images,
Kubernetes manifests, and Tilt configurations. For more information, see the xref:containers.adoc#_containers[Containers] page.