diff --git a/README.md b/README.md index 439a8450..a309c940 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,44 @@ -# Arcaflow Engine +# Arcaflow: The Noble Workflow Engine +Arcaflow logo showing a waterfall and a river with
+3 trees symbolizing the various plugins -The Arcaflow Engine allows you to run workflows using container engines, such as Docker or Kubernetes. The plugins must be built with the [Arcaflow SDK](https://arcalot.io/arcaflow/creating-plugins/python/). +Arcaflow is a highly flexible and portable workflow system that helps you build +pipelines of actions via plugins. Plugin steps typically perform one action well, +creating or manipulating data that is returned in a machine-readable format. Data is +validated according to schemas as it passes through the pipeline in order to clearly +diagnose type mismatch problems early. Arcaflow runs on your laptop, a jump host, or in +a CI system, requiring only the Arcaflow engine binary, a workflow definition in YAML, +and a compatible container runtime. +[Complete Arcaflow Documentation](https://arcalot.io/arcaflow) -## Pre-built binaries -If you want to use our pre-built binaries, you can find them in the [releases section](https://github.com/arcalot/arcaflow-engine/releases). +
+ +![image](arcaflow-basic-demo.gif) + +# The Arcaflow Engine + +The Arcaflow Engine is the core execution component for workflows. It allows you to use +actions provided by containerized plugins to build pipelines of work. The Arcaflow +engine can be configured to run plugins using Podman, Docker, and Kubernetes. + +An ever-growing catalog of +[official plugins](https://github.com/orgs/arcalot/repositories?q=%22arcaflow-plugin-%22) +is maintained within the Arcalot organization, and the plugins are available as +[versioned containers from Quay.io](https://quay.io/organization/arcalot). You can also +build your own containerized plugins using the Arcaflow SDK, available for +[Python](https://arcalot.io/arcaflow/plugins/python/) and +[Golang](https://arcalot.io/arcaflow/plugins/go/). We encourage you to +contribute your plugins to the community, and you can start by adding them to the +[plugins incubator](https://github.com/arcalot/arcaflow-plugins-incubator) repo via a +pull request. + +## Pre-built engine binaries + +Our pre-built engine binaries are available in the +[releases section](https://github.com/arcalot/arcaflow-engine/releases) for multiple +platforms and architectures. ## Building from source @@ -14,15 +48,16 @@ This system requires at least Go 1.18 to run and can be built from source: go build -o arcaflow cmd/arcaflow/main.go ``` -This binary can then be used to run Arcaflow workflows. +This self-contained engine binary can then be used to run Arcaflow workflows. -## Building a simple workflow +## Running a simple workflow -The simplest workflow is the example plugin workflow using the workflow schema version `v0.2.0`: (save it to workflow.yaml) +A set of [example workflows](https://github.com/arcalot/arcaflow-workflows) is available +to demonstrate workflow features.
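To follow along, you can stage a minimal context directory using the engine's default file names (a sketch; the `my-workflow` directory name is arbitrary, and the `workflow.yaml` contents come from the listing that follows):

```shell
# Stage a context directory using the engine's default file names.
mkdir -p my-workflow
# Input data for the workflow (the same input shown later in this README).
printf 'name: Arca Lot\n' > my-workflow/input.yaml
# Place the workflow definition alongside it (created empty here;
# paste in the YAML from the listing below).
touch my-workflow/workflow.yaml
# With the arcaflow binary and a container runtime available, the engine
# would then be invoked as:
#   arcaflow --context my-workflow --input input.yaml
ls my-workflow
```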
A basic example `workflow.yaml` may look like this: ```yaml -version: v0.2.0 -input: +version: v0.2.0 # The compatible workflow schema version +input: # The input schema for the workflow root: RootObject objects: RootObject: @@ -31,108 +66,107 @@ input: name: type: type_id: string -steps: +steps: # The individual steps of the workflow example: - plugin: ghcr.io/janosdebugs/arcaflow-example-plugin - # step: step-id if the plugin has more than one step - # deploy: - # type: docker|kubernetes - # ... more options + plugin: + deployment_type: image + src: quay.io/arcalot/arcaflow-plugin-example input: name: !expr $.input.name -output: - message: !expr $.steps.example.outputs.success.message +outputs: # The expected output schema and data for the workflow + success: + message: !expr $.steps.example.outputs.success.message ``` -As you can see, it has a `version`, `input`, a list of `steps`, and an `output` definition. Each of these keys is required in a workflow. These can be linked together using JSONPath expressions (not all features are supported). The expressions also determine the execution order of plugins. +As you can see, a workflow has the root keys of `version`, `input`, `steps`, and +`outputs`. Each of these keys is required in a workflow. Output values and inputs to +steps can be specified using the Arcaflow +[expression language](https://arcalot.io/arcaflow/workflows/expressions/). Input and +output references create dependencies between the workflow steps which determine their +execution order. -You can now create an input YAML for this workflow: (save it to input.yaml) +An input YAML file for this basic workflow may look like this: ```yaml name: Arca Lot ``` -If you have a local Docker / Moby setup installed, you can run it immediately: +The Arcaflow engine uses a configuration to define the standard behaviors for deploying +plugins within the workflow. 
The default configuration will use Podman to run the +container and will set the log outputs to the `info` level. +If you have a local Podman setup installed, you can simply run the workflow like this: + +```bash +arcaflow --input input.yaml ``` -./arcaflow -input input.yaml -``` -If you don't have a local Docker setup, you can also create a `config.yaml` with the following structure: +This results in the default behavior of using the built-in configuration and reading the +workflow from the `workflow.yaml` file in the current working directory. + +If you don't have a local Podman setup, or if you want to use another deployer or any +custom configuration parameters, you can create a `config.yaml` with your desired +parameters. For example: ```yaml deployers: image: - deployer_name: docker|podman|kubernetes - python: - deployer_name: python - # More deployer options + deployer_name: docker log: - level: debug|info|warning|error + level: debug +logged_outputs: + error: + level: debug ``` -You can load this config by passing the `-config` flag to Arcaflow. +You can load this config by passing the `--config` flag to Arcaflow. -### Supported Workflow Schema Versions - -- v0.2.0 +```bash +arcaflow --input input.yaml --config config.yaml +``` -## Deployer options +The default workflow file name is `workflow.yaml`, but you can override this with the +`--workflow` input parameter. -Currently, the two deployer types supported are Docker and Kubernetes. +Arcaflow also accepts a `--context` parameter that defines the base directory for all +input files. All relative file paths are from the context directory, and absolute paths +are also supported. The default context is the current working directory (`.`). -### The Docker deployer +### A few command examples... -This deployer uses the Docker socket to launch containers. 
It has the following config structure: +Use the built-in configuration and run the `workflow.yaml` file from the `/my-workflow` +context directory with no input: -```yaml -image: - deployer_name: docker - connection: - host: # Docker connection string - cacert: # CA certificate for engine connection in PEM format - cert: # Client cert in PEM format - key: # Client key in PEM format - deployment: - container: # Container options, see https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerCreate - host: # Host options, see https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerCreate - network: # Network options, see https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerCreate - platform: # Platform options, see https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerCreate - - # Pull policy, similar to Kubernetes - imagePullPolicy: Always|IfNotPresent|Never - timeouts: - http: 15s +```bash +arcaflow --context /my-workflow ``` -**Note:** not all container options are supported. STDIN/STDOUT-related options are disabled. Some other options may not be implemented yet, but you will always get an error message explaining missing options. +Use a custom `my-config.yaml` configuration file and run the `my-workflow.yaml` workflow +using the `my-input.yaml` input file from the current directory: -## The Kubernetes deployer +```bash +arcaflow --config my-config.yaml --workflow my-workflow.yaml --input my-input.yaml +``` -The Kubernetes deployer deploys on a Kubernetes cluster. 
It has the following config structure: +Use a custom `config.yaml` configuration file and the default `workflow.yaml` file from +the `/my-workflow` context directory, and an `input.yaml` file from the current working +directory: -```yaml -image: - deployer_name: kubernetes - connection: - host: api.server.host - path: /api - username: foo - password: bar - serverName: tls.server.name - cert: PEM-encoded certificate - key: PEM-encoded key - cacert: PEM-encoded CA certificate - bearerToken: Bearer token for access - qps: queries per second - burst: burst value - deployment: - metadata: - # Add pod metadata here - spec: - # Add a normal pod spec here, plus the following option here: - pluginContainer: - # A single container configuration the plugin will run in. Do not specify the image, the engine will fill that. - timeouts: - http: 15s +```bash +arcaflow --context /my-workflow --config config.yaml --input ${PWD}/input.yaml ``` + +## Deployers + +Image-based deployers are used to deploy plugins to container platforms. Each deployer +has configuration parameters specific to its platform. These deployers are: + +- [Podman](https://github.com/arcalot/arcaflow-engine-deployer-podman) +- [Docker](https://github.com/arcalot/arcaflow-engine-deployer-docker) +- [Kubernetes](https://github.com/arcalot/arcaflow-engine-deployer-kubernetes) + +There is also a +[Python deployer](https://github.com/arcalot/arcaflow-engine-deployer-python) that +allows for running Python plugins directly instead of in containers.
*Note that not all +Python plugins may work with the Python deployer, and any plugin dependencies must be +present on the target system.* diff --git a/arcaflow-basic-demo.gif b/arcaflow-basic-demo.gif new file mode 100644 index 00000000..31912ba3 Binary files /dev/null and b/arcaflow-basic-demo.gif differ diff --git a/cmd/arcaflow/main.go b/cmd/arcaflow/main.go index fe05d062..66f8533b 100644 --- a/cmd/arcaflow/main.go +++ b/cmd/arcaflow/main.go @@ -5,13 +5,15 @@ import ( "context" "flag" "fmt" + "os" + "os/signal" + "path/filepath" + "go.arcalot.io/log/v2" "go.flow.arcalot.io/engine" "go.flow.arcalot.io/engine/config" "go.flow.arcalot.io/engine/loadfile" "gopkg.in/yaml.v3" - "os" - "os/signal" ) // These variables are filled using ldflags during the build process with Goreleaser. @@ -51,61 +53,48 @@ func main() { Stdout: os.Stderr, }) + defaultConfig := ` +log: + level: info +deployers: + image: + deployer_name: podman + deployment: + imagePullPolicy: IfNotPresent +logged_outputs: + error: + level: info` + configFile := "" input := "" dir := "." workflowFile := "workflow.yaml" printVersion := false - flag.BoolVar(&printVersion, "version", printVersion, "Print Arcaflow Engine version and exit.") - flag.StringVar( - &configFile, - "config", - configFile, - "The Arcaflow configuration file to load, if any.", + const ( + versionUsage = "Print Arcaflow Engine version and exit." + configUsage = "The path to the Arcaflow configuration file to load, if any." + inputUsage = "The path to the workflow input file to load, if any." + contextUsage = "The path to the workflow directory to run from." + workflowUsage = "The path to the workflow file to load." ) - flag.StringVar( - &input, - "input", - input, - "The workflow input file to load. May be outside the workflow directory. If no input file is passed, "+ - "the workflow is assumed to take no input.", - ) - flag.StringVar( - &dir, - "context", - dir, - "The workflow directory to run from. 
Defaults to the current directory.", - ) - flag.StringVar( - &workflowFile, - "workflow", - workflowFile, - "The workflow file in the current directory to load. Defaults to workflow.yaml.", - ) - flag.Usage = func() { - _, _ = os.Stderr.Write([]byte(`Usage: arcaflow [OPTIONS] - -The Arcaflow engine will read the current directory and use it as a context -for executing the workflow. - -Options: + flag.BoolVar(&printVersion, "version", printVersion, versionUsage) + flag.StringVar(&configFile, "config", configFile, configUsage) + flag.StringVar(&input, "input", input, inputUsage) + flag.StringVar(&dir, "context", dir, contextUsage) + flag.StringVar(&workflowFile, "workflow", workflowFile, workflowUsage) - -version Print the Arcaflow Engine version and exit. - - -config FILENAME The Arcaflow configuration file to load, if any. - - -input FILENAME The workflow input file to load. May be outside the - workflow directory. If no input file is passed, - the workflow is assumed to take no input. - - -context DIRECTORY The workflow directory to run from. Defaults to the - current directory. - - -workflow FILENAME The workflow file in the current directory to load. - Defaults to workflow.yaml. 
-`)) + flag.Usage = func() { + w := flag.CommandLine.Output() + _, _ = w.Write( + []byte( + "Usage: " + os.Args[0] + " [OPTIONS]\n\n" + + "Arcaflow will read file paths relative to the context directory.\n\n", + ), + ) + flag.PrintDefaults() } + flag.Parse() if printVersion { @@ -116,7 +105,7 @@ Options: "Commit: %s\n"+ "Date: %s\n"+ "Apache 2.0 license\n"+ - "Copyright (c) Arcalot Contributors", + "Copyright (c) Arcalot Contributors\n", version, commit, date, ) return @@ -125,31 +114,44 @@ Options: var err error requiredFiles := map[string]string{ - RequiredFileKeyConfig: configFile, - RequiredFileKeyInput: input, RequiredFileKeyWorkflow: workflowFile, } - fileCtx, err := loadfile.NewFileCacheUsingContext(dir, requiredFiles) - if err != nil { - flag.Usage() - tempLogger.Errorf("context path resolution failed %s (%v)", dir, err) - os.Exit(ExitCodeInvalidData) + if len(configFile) != 0 { + requiredFiles[RequiredFileKeyConfig] = configFile } - configFilePath, err := fileCtx.AbsPathByKey(RequiredFileKeyConfig) + if len(input) != 0 { + requiredFiles[RequiredFileKeyInput] = input + } + + fileCtx, err := loadfile.NewFileCacheUsingContext(dir, requiredFiles) if err != nil { - tempLogger.Errorf("Unable to find configuration file %s (%v)", configFile, err) + tempLogger.Errorf("Context path resolution failed %s (%v)", dir, err) flag.Usage() os.Exit(ExitCodeInvalidData) } var configData any - configData, err = loadYamlFile(configFilePath) - if err != nil { - tempLogger.Errorf("Failed to load configuration file %s (%v)", configFile, err) - flag.Usage() - os.Exit(ExitCodeInvalidData) + if len(configFile) == 0 { + if err := yaml.Unmarshal([]byte(defaultConfig), &configData); err != nil { + tempLogger.Errorf("Failed to load default configuration (%v)", err) + os.Exit(ExitCodeInvalidData) + } + } else { + configFilePath, err := fileCtx.AbsPathByKey(RequiredFileKeyConfig) + if err != nil { + tempLogger.Errorf("Unable to find configuration file %s (%v)", configFile, err) + flag.Usage() +
os.Exit(ExitCodeInvalidData) + } + + configData, err = loadYamlFile(configFilePath) + if err != nil { + tempLogger.Errorf("Failed to load configuration file %s (%v)", configFile, err) + flag.Usage() + os.Exit(ExitCodeInvalidData) + } } cfg, err := config.Load(configData) @@ -178,8 +180,17 @@ Options: } var inputData []byte - if input != "" { - inputData, err = os.ReadFile(input) + if len(input) == 0 { + inputData = []byte("{}") + } else { + inputFilePath, err := fileCtx.AbsPathByKey(RequiredFileKeyInput) + if err != nil { + tempLogger.Errorf("Unable to find input file %s (%v)", input, err) + flag.Usage() + os.Exit(ExitCodeInvalidData) + } + + inputData, err = os.ReadFile(filepath.Clean(inputFilePath)) if err != nil { logger.Errorf("Failed to read input file %s (%v)", input, err) flag.Usage()