docs: readme
Signed-off-by: thxCode <[email protected]>
thxCode committed Aug 21, 2024
1 parent 9cee44a commit 65a9ff8
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -93,15 +93,15 @@ $ gguf-parser --path="~/.cache/lm-studio/models/NousResearch/Hermes-2-Pro-Mistra
| llama | 450.50 KiB | 32032 | N/A | 1 | 32000 | N/A | N/A | N/A | N/A | N/A |
+-------+-------------+------------+------------------+-----------+-----------+-----------+-----------+---------------+-----------------+---------------+

-+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-| ESTIMATE |
-+-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+------------------------+--------------------+
-| ARCH | CONTEXT SIZE | BATCH SIZE (L / P) | FLASH ATTENTION | MMAP LOAD | EMBEDDING ONLY | OFFLOAD LAYERS | FULL OFFLOADED | RAM | VRAM 0 |
-| | | | | | | | +-----------+------------+--------+-----------+
-| | | | | | | | | UMA | NONUMA | UMA | NONUMA |
-+-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+-----------+------------+--------+-----------+
-| llama | 32768 | 2048 / 512 | Disabled | Supported | No | 33 (32 + 1) | Yes | 88.25 MiB | 238.25 MiB | 4 GiB | 11.06 GiB |
-+-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+-----------+------------+--------+-----------+
++---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| ESTIMATE |
++-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+-------------------------+--------------------+
+| ARCH | CONTEXT SIZE | BATCH SIZE (L / P) | FLASH ATTENTION | MMAP LOAD | EMBEDDING ONLY | OFFLOAD LAYERS | FULL OFFLOADED | RAM | VRAM 0 |
+| | | | | | | | +------------+------------+--------+-----------+
+| | | | | | | | | UMA | NONUMA | UMA | NONUMA |
++-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+------------+------------+--------+-----------+
+| llama | 32768 | 2048 / 512 | Disabled | Supported | No | 33 (32 + 1) | Yes | 176.25 MiB | 326.25 MiB | 4 GiB | 11.16 GiB |
++-------+--------------+--------------------+-----------------+-----------+----------------+----------------+----------------+------------+------------+--------+-----------+

$ # Retrieve the model's metadata via split file,
$ # which needs all split files has been downloaded.