Merge pull request #206 from janhq/chore/CLI
chore: update CLI
vansangpfiev authored Oct 7, 2024
2 parents 51848d9 + 7f86f9e commit c3570b7
Showing 10 changed files with 50 additions and 126 deletions.
6 changes: 4 additions & 2 deletions docs/cli/chat.mdx
@@ -14,8 +14,10 @@ import TabItem from "@theme/TabItem";
# `cortex chat`
:::info
This CLI command calls the following API endpoint:
- [Start Model](/api-reference#tag/models/post/v1/models/{modelId}/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions)
- [Download Model](/api-reference#tag/models/post/v1/models/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- Install Engine (The command only calls this endpoint if the specified engine is not downloaded yet.)
- [Start Model](/api-reference#tag/models/post/v1/models/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions) (The command makes a call to this endpoint if the `-c` option is used.)
:::

This command starts a chat session with a specified model, allowing you to interact directly with it through an interactive chat interface.
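A minimal invocation sketch, assuming the model id is passed as a positional argument (the id `tinyllama:gguf` is illustrative):

```bash
# Start an interactive chat session with a locally available model
cortex chat tinyllama:gguf
```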
17 changes: 7 additions & 10 deletions docs/cli/engines/get.mdx
@@ -16,18 +16,15 @@ This command returns an engine detail defined by an engine `name`.
## Usage

```bash
cortex engines get <name>
cortex engines get <engine_name>
```
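A concrete invocation sketch, using the `llama-cpp` engine from the sample output below (substitute any name reported by `cortex engines list`):

```bash
cortex engines get llama-cpp
```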
For example, it returns the following:
```bash
┌─────────────┬────────────────────────────────────────────────────────────────────────────┐
│ (index)     │ Values                                                                     │
├─────────────┼────────────────────────────────────────────────────────────────────────────┤
│ name        │ 'onnx'                                                                     │
│ description │ 'This extension enables chat completion API calls using the Cortex engine' │
│ version     │ '0.0.1'                                                                    │
│ productName │ 'Cortex Inference Engine'                                                  │
└─────────────┴────────────────────────────────────────────────────────────────────────────┘
+-----------+-------------------+---------+----------------------------+--------+
| Name | Supported Formats | Version | Variant | Status |
+-----------+-------------------+---------+----------------------------+--------+
| llama-cpp | GGUF | 0.1.34 | linux-amd64-avx2-cuda-12-0 | Ready |
+-----------+-------------------+---------+----------------------------+--------+
```
:::info
To get an engine name, run the [`engines list`](/docs/cli/engines/list) command first.
@@ -38,6 +35,6 @@ To get an engine name, run the [`engines list`](/docs/cli/engines/list) command

| Option | Description | Required | Default value | Example |
|-------------------|-------------------------------------------------------|----------|---------------|-----------------|
| `name` | The name of the engine that you want to retrieve. | Yes | - | `llamacpp`|
| `name` | The name of the engine that you want to retrieve. | Yes | - | `llama-cpp`|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

37 changes: 17 additions & 20 deletions docs/cli/engines/index.mdx
@@ -114,7 +114,7 @@ To get an engine name, run the [`engines list`](/docs/cli/engines/list) command

| Option | Description | Required | Default value | Example |
|-------------------|-------------------------------------------------------|----------|---------------|-----------------|
| `engine_name` | The name of the engine that you want to retrieve. | Yes | - | `llamacpp`|
| `engine_name` | The name of the engine that you want to retrieve. | Yes | - | `llama-cpp`|
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

## `cortex engines list`
@@ -159,18 +159,15 @@ You can use the `--verbose` flag to display more detailed output of the internal

For example, it returns the following:
```bash
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| (Index) | name         | description                                                                     | version | product name                 | status          |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 1       | onnx         | This extension enables chat completion API calls using the Onnx engine         | 0.0.1   | Onnx Inference Engine        | not_initialized |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 2       | llamacpp     | This extension enables chat completion API calls using the LlamaCPP engine     | 0.0.1   | LlamaCPP Inference Engine    | ready           |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
| 3       | tensorrt-llm | This extension enables chat completion API calls using the TensorrtLLM engine  | 0.0.1   | TensorrtLLM Inference Engine | not_initialized |
+---------+--------------+---------------------------------------------------------------------------------+---------+------------------------------+-----------------+
+---+--------------+-------------------+---------+----------------------------+---------------+
| # | Name | Supported Formats | Version | Variant | Status |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 1 | onnxruntime | ONNX | | | Incompatible |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 2 | llama-cpp | GGUF | 0.1.34 | linux-amd64-avx2-cuda-12-0 | Ready |
+---+--------------+-------------------+---------+----------------------------+---------------+
| 3 | tensorrt-llm | TensorRT Engines | | | Not Installed |
+---+--------------+-------------------+---------+----------------------------+---------------+
```

**Options**:
@@ -186,9 +183,9 @@ This CLI command calls the following API endpoint:
- [Init Engine](/api-reference#tag/engines/post/v1/engines/{name}/init)
:::
This command downloads the required dependencies and installs the engine within Cortex. Currently, Cortex supports three engines:
- `Llama.cpp`
- `Onnx`
- `Tensorrt-llm`
- `llama-cpp`
- `onnxruntime`
- `tensorrt-llm`

**Usage**:
:::info
@@ -224,10 +221,10 @@ You can use the `--verbose` flag to display more detailed output of the internal
For Example:
```bash
## Llama.cpp engine
cortex engines install llamacpp
cortex engines install llama-cpp

## ONNX engine
cortex engines install onnx
cortex engines install onnxruntime

## Tensorrt-LLM engine
cortex engines install tensorrt-llm
@@ -279,10 +276,10 @@ You can use the `--verbose` flag to display more detailed output of the internal
For Example:
```bash
## Llama.cpp engine
cortex engines uninstall llamacpp
cortex engines uninstall llama-cpp

## ONNX engine
cortex engines uninstall onnx
cortex engines uninstall onnxruntime

## Tensorrt-LLM engine
cortex engines uninstall tensorrt-llm
4 changes: 2 additions & 2 deletions docs/cli/engines/init.mdx
@@ -19,10 +19,10 @@ cortex engines init [options] <name>
For Example:
```bash
## Llama.cpp engine
cortex engines init llamacpp
cortex engines init llama-cpp

## ONNX engine
cortex engines init onnx
cortex engines init onnxruntime

## Tensorrt-LLM engine
cortex engines init tensorrt-llm
6 changes: 3 additions & 3 deletions docs/cli/engines/list.mdx
Expand Up @@ -23,9 +23,9 @@ For example, it returns the following:
+---+---------------+--------------------+---------+--------------+
| # | Name | Supported Formats | Version | Status |
+---+---------------+--------------------+---------+--------------+
| 1 | ONNXRuntime | ONNX | 0.0.1 | Incompatible |
| 2 | llama.cpp | GGUF | 0.0.1 | Ready |
| 3 | TensorRT-LLM | TensorRT Engines | 0.0.1 | Incompatible |
| 1 | onnxruntime | ONNX | 0.0.1 | Incompatible |
| 2 | llama-cpp | GGUF | 0.0.1 | Ready |
| 3 | tensorrt-llm | TensorRT Engines | 0.0.1 | Incompatible |
+---+---------------+--------------------+---------+--------------+
```

51 changes: 0 additions & 51 deletions docs/cli/models/index.mdx
@@ -323,57 +323,6 @@ This command uses a `model_id` from the model that you have started before.
| `model_id` | The identifier of the model you want to stop. | Yes | - | `mistral` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
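A minimal sketch of stopping a running model, using the illustrative `mistral` id from the options table above:

```bash
cortex models stop mistral
```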

## `cortex models update`
:::info
This CLI command calls the following API endpoint:
- [Update Model](/api-reference#tag/models/patch/v1/models/{model})
:::
This command updates a model configuration defined by a `model_id`.



**Usage**:
:::info
You can use the `--verbose` flag to display more detailed output of the internal processes. To apply this flag, use the following format: `cortex --verbose [subcommand]`.
:::
<Tabs>
<TabItem value="MacOs/Linux" label="MacOs/Linux">
```sh
# Stable
cortex models update [options] <model_id>

# Beta
cortex-beta models update [options] <model_id>

# Nightly
cortex-nightly models update [options] <model_id>
```
</TabItem>
<TabItem value="Windows" label="Windows">
```sh
# Stable
cortex.exe models update [options] <model_id>

# Beta
cortex-beta.exe models update [options] <model_id>

# Nightly
cortex-nightly.exe models update [options] <model_id>
```
</TabItem>
</Tabs>

:::info
This command uses a `model_id` from the model that you have downloaded or available in your file system.
:::
**Options**:

| Option | Description | Required | Default value | Example |
|-----------------------------|-------------------------------------------------------------------------------------------------------|----------|----------------------|-----------------------------------------------------------|
| `model_id` | The identifier of the model you want to update. | Yes | - | `mistral` |
| `-c`, `--options <options...>` | Specify the options to update the model. Syntax: `-c option1=value1 option2=value2`. | Yes | - | `-c max_tokens=100 temperature=0.5` |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |

## `cortex models delete`
:::info
This CLI command calls the following API endpoint:
18 changes: 7 additions & 11 deletions docs/cli/models/list.md
@@ -20,17 +20,13 @@ cortex models list [options]
```
For example, it returns the following:
```bash
┌─────────┬────────────────────────────────────────────┬────────────────┬───────────┐
│ (index) │ id                                         │ engine         │ version   │
├─────────┼────────────────────────────────────────────┼────────────────┼───────────┤
│ 0       │ 'gpt-3.5-turbo'                            │ 'openai'       │ 1         │
│ 1       │ 'gpt-4o'                                   │ 'openai'       │ 1         │
│ 2       │ 'llama3:onnx'                              │ 'onnx'         │ 1         │
│ 3       │ 'llama3'                                   │ 'llamacpp'     │ undefined │
│ 4       │ 'openhermes-2.5:tensorrt-llm-windows-ada'  │ 'tensorrt-llm' │ 1         │
│ 5       │ 'openhermes-2.5:tensorrt-llm'              │ 'tensorrt-llm' │ 1         │
│ 6       │ 'tinyllama'                                │ 'llamacpp'     │ undefined │
└─────────┴────────────────────────────────────────────┴────────────────┴───────────┘
+---------+----------------+----------------+-----------------+---------+
| (Index) | ID | model alias | engine | version |
+---------+----------------+----------------+-----------------+---------+
| 1 | llama3:gguf | llama3:gguf | llama-cpp | 1 |
+---------+----------------+----------------+-----------------+---------+
| 2 | tinyllama:gguf | tinyllama:gguf | llama-cpp | 1 |
+---------+----------------+----------------+-----------------+---------+

```

27 changes: 6 additions & 21 deletions docs/cli/ps.mdx
@@ -12,11 +12,7 @@ import TabItem from "@theme/TabItem";
:::

# `cortex ps`
:::info
This CLI command calls the following API endpoint:
- [Get Model Status](/api-reference#tag/system/get/v1/system/events/model)
- [Get Resource Status](/api-reference#tag/system/get/v1/system/events/resources)
:::

This command shows the running model and its status.
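A minimal invocation sketch:

```bash
# Show the running models and their resource usage
cortex ps
```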


@@ -56,22 +52,11 @@ You can use the `--verbose` flag to display more detailed output of the internal
For example, it returns the following table:

```bash
√ Dependencies loaded in 2882ms
√ Getting models...
√ Running PS command...
┌─────────┬──────────────────────┬────────────┬───────────┬──────────┬─────┬──────┐
│ (index) │ modelId              │ engine     │ status    │ duration │ ram │ vram │
├─────────┼──────────────────────┼────────────┼───────────┼──────────┼─────┼──────┤
│ 0       │ 'janhq/tinyllama/1b' │ 'llamacpp' │ 'running' │ '7s'     │ '-' │ '-'  │
└─────────┴──────────────────────┴────────────┴───────────┴──────────┴─────┴──────┘
√ API server is offline
## The ps command also provides information on the percentage of system resources being used.
┌─────────┬───────────┬──────────────┬────────────────────────┐
│ (index) │ CPU Usage │ Memory Usage │ VRAM                   │
├─────────┼───────────┼──────────────┼────────────────────────┤
│ 0       │ '4.13%'   │ '83.11%'     │                        │
│ 1       │           │              │ [ [Object], [Object] ] │
└─────────┴───────────┴──────────────┴────────────────────────┘
+----------------+-----------+----------+-----------+-----------+
| Model | Engine | RAM | VRAM | Up time |
+----------------+-----------+----------+-----------+-----------+
| tinyllama:gguf | llama-cpp | 35.16 MB | 601.02 MB | 5 seconds |
+----------------+-----------+----------+-----------+-----------+
```
## Options

8 changes: 3 additions & 5 deletions docs/cli/run.mdx
@@ -14,13 +14,12 @@ import TabItem from "@theme/TabItem";
# `cortex run`
:::info
This CLI command calls the following API endpoint:
- [Download Model](/api-reference#tag/models/post/v1/models/{modelId}/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- [Download Model](/api-reference#tag/models/post/v1/models/pull) (The command only calls this endpoint if the specified model is not downloaded yet.)
- Install Engine (The command only calls this endpoint if the specified engine is not downloaded yet.)
- [Start Model](/api-reference#tag/models/post/v1/models/{modelId}/start)
- [Chat Completions](/api-reference#tag/inference/post/v1/chat/completions) (The command makes a call to this endpoint if the `-c` option is used.)
- [Start Model](/api-reference#tag/models/post/v1/models/start)
:::

This command facilitates the initiation of an interactive chat shell with a specified machine-learning model.
This command starts a specified machine-learning model.

## Usage
:::info
@@ -80,4 +79,3 @@ This command downloads and installs the model if not already available in your f
1. [`cortex pull`](/docs/cli/models/): This command pulls the specified model if the model is not yet downloaded.
2. [`cortex engines install`](/docs/cli/engines/): This command installs the specified engines if not yet downloaded.
3. [`cortex models start`](/docs/cli/models/): This command starts the specified model, making it active and ready for interactions.
4. [`cortex chat`](/docs/cli/chat): Following model activation, this command opens an interactive chat shell where users can directly communicate with the model.
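A minimal end-to-end sketch of that flow, assuming the illustrative model id `tinyllama:gguf`:

```bash
# Pulls the model and its engine if missing, then starts the model
cortex run tinyllama:gguf
```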
2 changes: 1 addition & 1 deletion docs/cli/start.mdx
@@ -54,7 +54,7 @@ You can use the `--verbose` flag to display more detailed output of the internal
| Option | Description | Required | Default value | Example |
| ---------------------------- | ----------------------------------------- | -------- | ------------- | ----------------------------- |
| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
| `-p`, `--port <port>` | Port to serve the application. | No | - | `-p 3928` |
| `-p`, `--port <port>` | Port to serve the application. | No | - | `-p 39281` |
<!-- | `-a`, `--address <address>` | Address to use. | No | - | `-a 192.168.1.1` | -->
<!--| `--dataFolder <dataFolder>` | Set the data folder directory | No | - | `--dataFolder /path/to/data` | -->
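For instance, a sketch that serves the API server on the port from the example above:

```bash
cortex start -p 39281
```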
