Merge pull request #206 from janhq/chore/CLI
chore: update CLI
vansangpfiev authored Oct 7, 2024
2 parents 51848d9 + 7f86f9e commit c3570b7
Showing 10 changed files with 50 additions and 126 deletions.
6 changes: 4 additions & 2 deletions docs/cli/chat.mdx
@@ -14,8 +14,10 @@ import TabItem from "@theme/TabItem";
# `cortex chat`
:::info
This CLI command calls the following API endpoint:
- [Start Model](/api-reference#tag/models/post/v1/models/{modelId}/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions)
- [Download Model](/api-reference#tag/models/post/v1/models/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- Install Engine (The command only calls this endpoint if the specified engine is not downloaded yet.)
- [Start Model](/api-reference#tag/models/post/v1/models/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions) (The command makes a call to this endpoint if the `-c` option is used.)
:::

This command starts a chat session with a specified model, allowing you to interact directly with it through an interactive chat interface.
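A minimal invocation sketch, assuming the model id is passed as a positional argument (the id `tinyllama:gguf` is illustrative):

```bash
# Start an interactive chat session with a locally available model
cortex chat tinyllama:gguf
```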
17 changes: 7 additions & 10 deletions docs/cli/engines/get.mdx
@@ -16,18 +16,15 @@ This command returns an engine detail defined by an engine `name`.
## Usage

```bash
cortex engines get <name>
cortex engines get <engine_name>
```
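A concrete invocation sketch, using the `llama-cpp` engine from the sample output below (substitute any name reported by `cortex engines list`):

```bash
cortex engines get llama-cpp
```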
For example, it returns the following:
```bash
┌─────────────┬────────────────────────────────────────────────────────────────────────────┐
│ (index)     │ Values                                                                     │
├─────────────┼────────────────────────────────────────────────────────────────────────────┤
│ name        │ 'onnx'                                                                     │
│ description │ 'This extension enables chat completion API calls using the Cortex engine' │
│ version     │ '0.0.1'                                                                    │
│ productName │ 'Cortex Inference Engine'                                                  │
└─────────────┴────────────────────────────────────────────────────────────────────────────┘
+-----------+-------------------+---------+----------------------------+--------+
| Name | Supported Formats | Version | Variant | Status |
+-----------+-------------------+---------+----------------------------+--------+
| llama-cpp | GGUF | 0.1.34 | linux-amd64-avx2-cuda-12-0 | Ready |
+-----------+-------------------+---------+----------------------------+--------+
```
:::info
To get an engine name, run the [`engines list`](/docs/cli/engines/list) command first.
@@ -38,6 +35,6 @@ To get an engine name, run the [`engines list`](/docs/cli/engines/list) command

| Option | Description | Required | Default value | Example |
|-------------------|-------------------------------------------------------|----------|---------------|-----------------|
| `name` | The name of the engine that you want to retrieve. | Yes | - | `llamacpp`|
| `name` | The name of the engine that you want to retrieve. | Yes | - | `llama-cpp`|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

37 changes: 17 additions & 20 deletions docs/cli/engines/index.mdx
@@ -114,7 +114,7 @@ To get an engine name, run the [`engines list`](/docs/cli/engines/list) command

| Option | Description | Required | Default value | Example |
|-------------------|-------------------------------------------------------|----------|---------------|-----------------|
| `engine_name` | The name of the engine that you want to retrieve. | Yes | - | `llamacpp`|
| `engine_name` | The name of the engine that you want to retrieve. | Yes | - | `llama-cpp`|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

## `cortex engines list`
@@ -159,18 +159,15 @@ You can use the `--verbose` flag to display more detailed output of the internal

For example, it returns the following:
```bash
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| (Index) | name         | description                                                                     | version | product name                 | status          |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 1       | onnx         | This extension enables chat completion API calls using the Onnx engine         | 0.0.1   | Onnx Inference Engine        | not_initialized |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 2       | llamacpp     | This extension enables chat completion API calls using the LlamaCPP engine     | 0.0.1   | LlamaCPP Inference Engine    | ready           |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 3       | tensorrt-llm | This extension enables chat completion API calls using the TensorrtLLM engine  | 0.0.1   | TensorrtLLM Inference Engine | not_initialized |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
+---+--------------+-------------------+---------+----------------------------+---------------+
| # | Name | Supported Formats | Version | Variant | Status |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 1 | onnxruntime | ONNX | | | Incompatible |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 2 | llama-cpp | GGUF | 0.1.34 | linux-amd64-avx2-cuda-12-0 | Ready |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 3 | tensorrt-llm | TensorRT Engines | | | Not Installed |
+---+--------------+-------------------+---------+----------------------------+---------------+
```

**Options**:
@@ -186,9 +183,9 @@ This CLI command calls the following API endpoint:
- [Init Engine](/api-reference#tag/engines/post/v1/engines/{name}/init)
:::
This command downloads the required dependencies and installs the engine within Cortex. Currently, Cortex supports three engines:
- `Llama.cpp`
- `Onnx`
- `Tensorrt-llm`
- `llama-cpp`
- `onnxruntime`
- `tensorrt-llm`

**Usage**:
:::info
@@ -224,10 +221,10 @@ You can use the `--verbose` flag to display more detailed output of the internal
For Example:
```bash
## Llama.cpp engine
cortex engines install llamacpp
cortex engines install llama-cpp

## ONNX engine
cortex engines install onnx
cortex engines install onnxruntime

## Tensorrt-LLM engine
cortex engines install tensorrt-llm
@@ -279,10 +276,10 @@ You can use the `--verbose` flag to display more detailed output of the internal
For Example:
```bash
## Llama.cpp engine
cortex engines uninstall llamacpp
cortex engines uninstall llama-cpp

## ONNX engine
cortex engines uninstall onnx
cortex engines uninstall onnxruntime

## Tensorrt-LLM engine
cortex engines uninstall tensorrt-llm
4 changes: 2 additions & 2 deletions docs/cli/engines/init.mdx
@@ -19,10 +19,10 @@ cortex engines init [options] <name>
For Example:
```bash
## Llama.cpp engine
cortex engines init llamacpp
cortex engines init llama-cpp

## ONNX engine
cortex engines init onnx
cortex engines init onnxruntime

## Tensorrt-LLM engine
cortex engines init tensorrt-llm
6 changes: 3 additions & 3 deletions docs/cli/engines/list.mdx
Expand Up @@ -23,9 +23,9 @@ For example, it returns the following:
+---+---------------+--------------------+---------+--------------+
| # | Name | Supported Formats | Version | Status |
+---+---------------+--------------------+---------+--------------+
| 1 | ONNXRuntime | ONNX | 0.0.1 | Incompatible |
| 2 | llama.cpp | GGUF | 0.0.1 | Ready |
| 3 | TensorRT-LLM | TensorRT Engines | 0.0.1 | Incompatible |
| 1 | onnxruntime | ONNX | 0.0.1 | Incompatible |
| 2 | llama-cpp | GGUF | 0.0.1 | Ready |
| 3 | tensorrt-llm | TensorRT Engines | 0.0.1 | Incompatible |
+---+---------------+--------------------+---------+--------------+
```

51 changes: 0 additions & 51 deletions docs/cli/models/index.mdx
@@ -323,57 +323,6 @@ This command uses a `model_id` from the model that you have started before.
| `model_id` | The identifier of the model you want to stop. | Yes | - | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
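A minimal sketch of stopping a running model, using the illustrative `mistral` id from the options table above:

```bash
cortex models stop mistral
```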

## `cortex models update`
:::info
This CLI command calls the following API endpoint:
- [Update Model](/api-reference#tag/models/patch/v1/models/{model})
:::
This command updates a model configuration defined by a `model_id`.



**Usage**:
:::info
You can use the `--verbose` flag to display more detailed output of the internal processes. To apply this flag, use the following format: `cortex --verbose [subcommand]`.
:::
<Tabs>
<TabItem value="MacOs/Linux" label="MacOs/Linux">
```sh
# Stable
cortex models update [options] <model_id>

# Beta
cortex-beta models update [options] <model_id>

# Nightly
cortex-nightly models update [options] <model_id>
```
</TabItem>
<TabItem value="Windows" label="Windows">
```sh
# Stable
cortex.exe models update [options] <model_id>

# Beta
cortex-beta.exe models update [options] <model_id>

# Nightly
cortex-nightly.exe models update [options] <model_id>
```
</TabItem>
</Tabs>

:::info
This command uses a `model_id` from the model that you have downloaded or available in your file system.
:::
**Options**:

| Option | Description | Required | Default value | Example |
|-----------------------------|-------------------------------------------------------------------------------------------------------|----------|----------------------|-----------------------------------------------------------|
| `model_id` | The identifier of the model you want to update. | Yes | - | `mistral` |
| `-c`, `--options <options...>` | Specify the options to update the model. Syntax: `-c option1=value1 option2=value2`. | Yes | - | `-c max_tokens=100 temperature=0.5` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

## `cortex models delete`
:::info
This CLI command calls the following API endpoint:
18 changes: 7 additions & 11 deletions docs/cli/models/list.md
@@ -20,17 +20,13 @@ cortex models list [options]
```
For example, it returns the following:
```bash
┌─────────┬────────────────────────────────────────────┬────────────────┬───────────┐
│ (index) │ id                                         │ engine         │ version   │
├─────────┼────────────────────────────────────────────┼────────────────┼───────────┤
│ 0       │ 'gpt-3.5-turbo'                            │ 'openai'       │ 1         │
│ 1       │ 'gpt-4o'                                   │ 'openai'       │ 1         │
│ 2       │ 'llama3:onnx'                              │ 'onnx'         │ 1         │
│ 3       │ 'llama3'                                   │ 'llamacpp'     │ undefined │
│ 4       │ 'openhermes-2.5:tensorrt-llm-windows-ada'  │ 'tensorrt-llm' │ 1         │
│ 5       │ 'openhermes-2.5:tensorrt-llm'              │ 'tensorrt-llm' │ 1         │
│ 6       │ 'tinyllama'                                │ 'llamacpp'     │ undefined │
└─────────┴────────────────────────────────────────────┴────────────────┴───────────┘
+---------+----------------+----------------+-----------------+---------+
| (Index) | ID | model alias | engine | version |
+---------+----------------+----------------+-----------------+---------+
| 1 | llama3:gguf | llama3:gguf | llama-cpp | 1 |
+---------+----------------+----------------+-----------------+---------+
| 2 | tinyllama:gguf | tinyllama:gguf | llama-cpp | 1 |
+---------+----------------+----------------+-----------------+---------+

```

27 changes: 6 additions & 21 deletions docs/cli/ps.mdx
@@ -12,11 +12,7 @@ import TabItem from "@theme/TabItem";
:::

# `cortex ps`
:::info
This CLI command calls the following API endpoint:
- [Get Model Status](/api-reference#tag/system/get/v1/system/events/model)
- [Get Resource Status](/api-reference#tag/system/get/v1/system/events/resources)
:::

This command shows the running model and its status.
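A minimal invocation sketch:

```bash
# Show the running models and their resource usage
cortex ps
```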


@@ -56,22 +52,11 @@ You can use the `--verbose` flag to display more detailed output of the internal
For example, it returns the following table:

```bash
√ Dependencies loaded in 2882ms
√ Getting models...
√ Running PS command...
┌─────────┬──────────────────────┬────────────┬───────────┬──────────┬─────┬──────┐
│ (index) │ modelId              │ engine     │ status    │ duration │ ram │ vram │
├─────────┼──────────────────────┼────────────┼───────────┼──────────┼─────┼──────┤
│ 0       │ 'janhq/tinyllama/1b' │ 'llamacpp' │ 'running' │ '7s'     │ '-' │ '-'  │
└─────────┴──────────────────────┴────────────┴───────────┴──────────┴─────┴──────┘
√ API server is offline
## The ps command also provides information on the percentage of system resources being used.
┌─────────┬───────────┬──────────────┬────────────────────────┐
│ (index) │ CPU Usage │ Memory Usage │ VRAM                   │
├─────────┼───────────┼──────────────┼────────────────────────┤
│ 0       │ '4.13%'   │ '83.11%'     │                        │
│ 1       │           │              │ [ [Object], [Object] ] │
└─────────┴───────────┴──────────────┴────────────────────────┘
+----------------+-----------+----------+-----------+-----------+
| Model | Engine | RAM | VRAM | Up time |
+----------------+-----------+----------+-----------+-----------+
| tinyllama:gguf | llama-cpp | 35.16 MB | 601.02 MB | 5 seconds |
+----------------+-----------+----------+-----------+-----------+
```
## Options

8 changes: 3 additions & 5 deletions docs/cli/run.mdx
@@ -14,13 +14,12 @@ import TabItem from "@theme/TabItem";
# `cortex run`
:::info
This CLI command calls the following API endpoint:
- [Download Model](/api-reference#tag/models/post/v1/models/{modelId}/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- [Download Model](/api-reference#tag/models/post/v1/models/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- Install Engine (The command only calls this endpoint if the specified engine is not downloaded yet.)
- [Start Model](/api-reference#tag/models/post/v1/models/{modelId}/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions) (The command makes a call to this endpoint if the `-c` option is used.)
- [Start Model](/api-reference#tag/models/post/v1/models/start)
:::

This command facilitates the initiation of an interactive chat shell with a specified machine-learning model.
This command starts a specified machine-learning model.

## Usage
:::info
@@ -80,4 +79,3 @@ This command downloads and installs the model if not already available in your f
1. [`cortex pull`](/docs/cli/models/): This command pulls the specified model if the model is not yet downloaded.
2. [`cortex engines install`](/docs/cli/engines/): This command installs the specified engines if not yet downloaded.
3. [`cortex models start`](/docs/cli/models/): This command starts the specified model, making it active and ready for interactions.
4. [`cortex chat`](/docs/cli/chat): Following model activation, this command opens an interactive chat shell where users can directly communicate with the model.
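A minimal end-to-end sketch of that flow, assuming the illustrative model id `tinyllama:gguf`:

```bash
# Pulls the model and its engine if missing, then starts the model
cortex run tinyllama:gguf
```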
2 changes: 1 addition & 1 deletion docs/cli/start.mdx
@@ -54,7 +54,7 @@ You can use the `--verbose` flag to display more detailed output of the internal
| Option | Description | Required | Default value | Example |
| ---------------------------- | ----------------------------------------- | -------- | ------------- | ----------------------------- |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
| `-p`, `--port <port>` | Port to serve the application. | No | - | `-p 3928` |
| `-p`, `--port <port>` | Port to serve the application. | No | - | `-p 39281` |
<!-- | `-a`, `--address <address>` | Address to use. | No | - | `-a 192.168.1.1` | -->
<!--| `--dataFolder <dataFolder>` | Set the data folder directory | No | - | `--dataFolder /path/to/data` | -->
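For instance, a sketch that serves the API server on the port from the example above:

```bash
cortex start -p 39281
```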
