Merge pull request #2 from gabemontero/initial-code-drop
DEVAI-140: Initial code drop of CLI code
gabemontero authored Dec 2, 2024
2 parents 86c6dc2 + 3dc0dae commit f35cf52
Showing 5,335 changed files with 1,773,935 additions and 2 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
1 change: 1 addition & 0 deletions .gitignore
@@ -23,3 +23,4 @@ go.work.sum

# env file
.env
_output
51 changes: 51 additions & 0 deletions Makefile
@@ -0,0 +1,51 @@
APP = bac
OUTPUT_DIR ?= _output

CMD = ./cmd/$(APP)/...
PKG = ./pkg/...

BIN ?= $(OUTPUT_DIR)/$(APP)
KUBECTL_BIN ?= $(OUTPUT_DIR)/kubectl-$(APP)

GO_FLAGS ?= -mod=vendor
GO_TEST_FLAGS ?= -race -cover

GO_PATH ?= $(shell go env GOPATH)
GO_CACHE ?= $(shell go env GOCACHE)

INSTALL_LOCATION ?= /usr/local/bin

ARGS ?=

.EXPORT_ALL_VARIABLES:

.PHONY: $(BIN)
$(BIN):
	go build $(GO_FLAGS) -o $(BIN) $(CMD)

build: $(BIN)

install: build
	install -m 0755 $(BIN) $(INSTALL_LOCATION)

# creates a kubectl prefixed binary, "kubectl-$APP", and when installed under $PATH, will be
# visible as "kubectl $APP".
# See https://kubernetes.io/docs/tasks/extend-kubectl/kubectl-plugins/
# Not employing krew at this time
.PHONY: kubectl
kubectl: BIN = $(KUBECTL_BIN)
kubectl: $(BIN)

kubectl-install: BIN = $(KUBECTL_BIN)
kubectl-install: kubectl install

clean:
	rm -rf "$(OUTPUT_DIR)"

# runs all tests
test: test-unit

.PHONY: test-unit
test-unit:
	go test $(GO_FLAGS) $(GO_TEST_FLAGS) $(CMD) $(PKG) $(ARGS)

49 changes: 47 additions & 2 deletions README.md
@@ -1,2 +1,47 @@
# rhdh-ai-catalog-cli
A CLI that facilitates injecting model metadata from various sources into the Backstage / Red Hat Data Hub Catalog
# Backstage AI related administration CLI - bac

A CLI that facilitates injecting AI model metadata from various sources into the Backstage Catalog

## Contributing

All contributions are welcome. This project uses the [Apache 2 license](http://www.apache.org/licenses/), which does not require any
contributor agreement to submit patches. That said, issue tracking does not currently happen through GitHub issues
in this repository.

Rather, visit the team's [RHDHPAI Jira project and the 'model-registry-bridge' component](https://issues.redhat.com/issues/?jql=project%20%3D%20RHDHPAI%20AND%20component%20%3D%20model-registry-bridge).

As the team makes sufficient progress on the basic fit and finish items in the [roadmap](docs/roadmap.md), and sufficiently
progresses beyond the prototype phase, we'll revisit the use of GitHub issues in this repository.

See [the development guide](docs/DEVELOPMENT.md) for details on how to build and test any contributions you make.

## Usage

At a high level, the `bac` CLI

- Generates YAML formatted definitions of Backstage `Components`, `Resources`, and `APIs` catalog entities by accessing external systems that provide AI model metadata.
- The set of supported external systems is expected to grow over time, at least in the short term.
- Once that YAML is stored in an HTTP accessible file, the `bac` CLI provides commands to instruct a specific Backstage instance to import those entities into its catalog. The import shows up as a Backstage `Location` in the catalog, where the `Location` is the parent of the `Components`, `Resources`, and `APIs`.
- Those `Components`, `Resources`, and `APIs` will have specific AI related `types` which will allow for distinguishing from other `Components`, `Resources` and `APIs` in the catalog.
- It allows for the deletion of Backstage `Locations` and any `Components`, `Resources`, and `APIs` defined by that `Location`.
- Lastly, the `bac` CLI allows for retrieving any AI related `Components`, `Resources` and `APIs`.
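
As a rough illustration of the import step in the list above, the sketch below builds the kind of HTTP request that registering a `Location` by URL involves. It assumes the Backstage Catalog route `POST /api/catalog/locations` with a `{"type": "url", "target": ...}` body. The base URL, the target URL, and the `newImportRequest` helper are hypothetical and are not taken from the `bac` source.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// locationRequest is the assumed JSON body for registering a Location by URL.
type locationRequest struct {
	Type   string `json:"type"`
	Target string `json:"target"`
}

// newImportRequest builds (but does not send) the request that asks a
// Backstage instance to import the YAML file found at target.
// baseURL is a hypothetical Backstage backend address.
func newImportRequest(baseURL, target string) (*http.Request, error) {
	body, err := json.Marshal(locationRequest{Type: "url", Target: target})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/catalog/locations", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Both URLs are placeholders for illustration only.
	req, err := newImportRequest("http://localhost:7007", "https://example.com/models.yaml")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

A real client would also need whatever authentication the target Backstage instance requires, which this sketch leaves out.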

To receive detailed usage information and example invocations, after building the `bac` executable, you can run
```shell
bac help
```

This invocation will also provide the current list of subcommands. Similarly, running
```shell
bac help <subcommand>
bac help <subcommand> <subcommand>
```
will provide usage information, example invocations, optional flags, and additional subcommands for the current list of subcommands.

## Potential tl;dr

First, our [background document](docs/background.md) gets into the scenarios and personas we are targeting with this CLI,
as well as rationale for the syntax, language(s), and the like.

Then, our [roadmap document](docs/roadmap.md) provides a snapshot of the more immediate changes we have planned, with
Jira references when ideas reach sufficient priority to warrant official tracking.
71 changes: 71 additions & 0 deletions cmd/bac/main.go
@@ -0,0 +1,71 @@
package main

import (
	goflag "flag"
	"fmt"
	"github.com/redhat-ai-dev/rhdh-ai-catalog-cli/pkg/cmd/cli"
	"github.com/redhat-ai-dev/rhdh-ai-catalog-cli/pkg/util"
	"github.com/spf13/pflag"
	_ "k8s.io/client-go/plugin/pkg/client/auth"
	"k8s.io/klog/v2"
	"os"
)

var hiddenLogFlags = []string{
	"add_dir_header",
	"alsologtostderr",
	"log_backtrace_at",
	"log_dir",
	"log_file",
	"log_file_max_size",
	"logtostderr",
	"one_output",
	"skip_headers",
	"skip_log_headers",
	"stderrthreshold",
	"v",
	"vmodule",
}

func main() {
	if err := initGoFlags(); err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: %v\n", err)
		os.Exit(1)
	}
	initPFlags()

	rootCmd := cli.NewCmd()
	if err := rootCmd.Execute(); err != nil {
		klog.Errorf("ERROR: %v\n", err)
		os.Exit(1)
	}
}

// initGoFlags initializes the flag sets for klog.
// Any flags for "-h" or "--help" are ignored because pflag will show the usage later with all subcommands.
func initGoFlags() error {
	flagset := goflag.NewFlagSet(util.ApplicationName, goflag.ContinueOnError)
	goflag.CommandLine = flagset
	klog.InitFlags(flagset)

	args := []string{}
	for _, arg := range os.Args[1:] {
		if arg != "-h" && arg != "--help" {
			args = append(args, arg)
		}
	}
	return flagset.Parse(args)
}

// initPFlags initializes the pflags used by Cobra subcommands.
func initPFlags() {
	flags := pflag.NewFlagSet(util.ApplicationName, pflag.ExitOnError)
	flags.AddGoFlagSet(goflag.CommandLine)
	pflag.CommandLine = flags

	for _, flag := range hiddenLogFlags {
		if err := flags.MarkHidden(flag); err != nil {
			panic(err)
		}
	}
}
43 changes: 43 additions & 0 deletions docs/DEVELOPMENT.md
@@ -0,0 +1,43 @@
# Contributing

## Dependencies

Currently:
- `go`
- `make`
- `git`

See the [go.mod file](../go.mod) for the current version of Golang used in implementing this CLI.

Implementations in other languages may occur at a later date. The idea is that implementations in other languages
may help with engagement in one upstream community versus another.

## Build

To simply build the `bac` binary:
```shell
make build
```

To build the binary with a `kubectl-` prefix to enable basic [kubectl plugin support](https://kubernetes.io/docs/tasks/extend-kubectl/kubectl-plugins/):
```shell
make kubectl
```

To install the `bac` binary to a directory in your execution path (default is `/usr/local/bin` but can be changed with the
`INSTALL_LOCATION` environment variable):
```shell
make install
```

And similarly, to install the `kubectl-bac` binary:
```shell
make kubectl-install
```

## Test

Currently only Golang unit tests are present:
```shell
make test
```
89 changes: 89 additions & 0 deletions docs/background.md
@@ -0,0 +1,89 @@
# Background

## Explosion of 'AI Model Repositories'

The following list starts off incomplete, and will most likely always be a subset, of the places where "AI Model Metadata" is
accessible in some form or fashion:
- HuggingFace
- Ollama
- Kubeflow Model Registry
- KServe CRDs in K8s clusters
- MLFlow Model Registry
- OCI image registries like quay.io, registry.redhat.io, or docker.io
- API Gateways like 3Scale or Kong

Each has some form of REST API. Most, if not all, have a CLI that interacts with said REST API.

The goal here is to build some form of normalization around taking data from those sources and constructing Backstage catalog
artifacts or "Entities" (where, by the way, the Backstage Catalog also has a REST API).

## Developer exploration vs. prescription from the enterprise

So "who" should be making the decision of which AI Models land in a Backstage instance to facilitate the use of AI
in applications developed with Backstage?

Whether the answer includes
- any developer on the given Backstage instance (where the developer team may have set it up)
- select developers on the instance
- or only the DevOps or MLOps or Platform engineers who set up the Backstage instances, where those folks are different people than the developers using Backstage

can potentially result in different preferences for "how" the Catalog is updated.

## Personas and their scenarios

Boiling that down into an agile story description:

As a Platform or MLOps engineer, I want to administer the Backstage Catalog from the command line so that I can better automate administration of Backstage for AI related application development.

As a DevOps engineer, I want to administer the Backstage Catalog from the command line so that I can better automate verification pipelines for testing of Backstage’s AI related features.

## Syntax

The syntax draws on both [this UXD CLI guidelines reference](https://www.uxd-hub.com/entries/design/cli-guidelines) and the relative success in recent years of various CLIs in the cloud computing space (which also reflect the initial contributors' background):

- `kubectl`
- `docker`
- `oc`
- `podman`
- `tkn`
- `aws`
- `rosa`
- `shp`

Generally speaking, you'll see some form of:

- "cmd verb subject args" pattern ... `kubectl get pods ...` or `rosa create cluster` or `oc delete routes`
- "cmd verb-subject args" pattern ... `oc new-app ...` or `rosa list-clusters` or `oc new-build` or `oc cancel-build` or `oc import-image`
- and sometimes even "cmd subject verb args" .... `tkn pr list ...` or `shp build create ...`

Backstage's current Catalog REST API also factors in, particularly its lack of symmetry between which verbs apply to which subjects:

- You can only import through the 'location' REST endpoint, using a URL pointing to a YAML document that contains the definition of multiple subjects (Components, Resources, APIs, pointers to TechDocs)
- But you can get/delete all the subject types via REST API endpoints that don't include the subject name in the REST URI

The (initial) decision: follow the prior art, with a mixture of "cmd verb-subject ..." and "cmd verb subject" where it makes sense.

So we have:

- `bac new-model kserve` for generating Backstage Catalog Entities in YAML format based on KServe CRD instances on a running Kubernetes cluster.
- `bac new-model kubeflow` for generating Backstage Catalog Entities in YAML format based on information pulled from the Kubeflow Model Registry
- (with more sources to be added to `bac new-model`, see the [roadmap](roadmap.md))
- then, after storing the YAML from `bac new-model` in an HTTP accessible file, you call `bac import-model <URL of that file>` to create a new Backstage `Location` in a Backstage instance's catalog, containing the entities defined in the YAML file referenced by the URL. The output of that command will include the ID for the `Location`
- later on, if need be, you can run `bac delete-model <ID from bac import-model>` to remove the `Location` and associated entities.
- Lastly, there is a `bac get [locations|entities|components|resources|apis]` command for querying the Backstage Catalog for AI related entities (which we designate with special values for the `.spec.type` field)
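
The mixed convention in the list above can be sketched as a tiny router that accepts both "verb-subject" and "verb subject" forms. This is a hypothetical, standard-library-only illustration of the syntax decision. It is not how `bac` is implemented, since the actual CLI builds its commands with Cobra, and the `route` function is an invented name.

```go
package main

import (
	"fmt"
	"strings"
)

// route splits command-line arguments into a verb, a subject, and the
// remaining arguments. A hyphenated first token such as "new-model" yields
// verb "new" and subject "model"; otherwise the first token is the verb and
// the second token is the subject, as in "get components".
func route(args []string) (verb, subject string, rest []string, err error) {
	if len(args) == 0 {
		return "", "", nil, fmt.Errorf("no command given")
	}
	if v, s, ok := strings.Cut(args[0], "-"); ok {
		return v, s, args[1:], nil
	}
	if len(args) < 2 {
		return "", "", nil, fmt.Errorf("%q needs a subject", args[0])
	}
	return args[0], args[1], args[2:], nil
}

func main() {
	for _, line := range [][]string{
		{"new-model", "kserve"},
		{"get", "components"},
	} {
		verb, subject, rest, err := route(line)
		if err != nil {
			panic(err)
		}
		fmt.Printf("verb=%s subject=%s rest=%v\n", verb, subject, rest)
	}
}
```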

The current plan is not to update this list every time we add, say, a new model registry option to `bac new-model`. However, if we pivot on the
syntax philosophy in a significant way, we'll update this section.

## Implementation language(s)

The initial drop of this CLI is in Golang. Simply put, initial team expertise, plus some potential ideas around Kubernetes based
services to help coordinate AI development tooling and AI deployments for development, staging, and production, led to
this decision. Similarly, the CLIs of the OCI ecosystem, most notably `docker` and `podman`, are also written in Golang.

That said, the Backstage ecosystem is TypeScript based, and a TypeScript CLI could serve as a building block for
a Backstage plugin. The AI space already has CLIs written in several languages, most notably Python. Lastly, API
Gateway CLIs further expand the spectrum: Ruby, vanilla JavaScript, and TypeScript based CLIs all exist in that space.

So, this project will entertain, or at least not dismiss, the notion of versions in more than one language to facilitate
plays in those different ecosystems.