From 947aeb60e7bb4d71a339c82f52501011da7c81a1 Mon Sep 17 00:00:00 2001 From: GitHub Actions Date: Thu, 18 Jul 2024 16:23:43 +0000 Subject: [PATCH] Update version to v0.0.59 --- docs/getting-started/Open-weight-models.mdx | 9 ++++++++- docs/getting-started/changelog.mdx | 2 +- docs/getting-started/models.mdx | 4 ++-- version.txt | 2 +- 4 files changed, 12 insertions(+), 5 deletions(-) diff --git a/docs/getting-started/Open-weight-models.mdx b/docs/getting-started/Open-weight-models.mdx index 5723123..ddfce7c 100644 --- a/docs/getting-started/Open-weight-models.mdx +++ b/docs/getting-started/Open-weight-models.mdx @@ -6,15 +6,19 @@ sidebar_position: 1.4 We open-source both pre-trained models and fine-tuned models. These models are not tuned for safety as we want to empower users to test and refine moderation based on their use cases. For safer models, follow our [guardrailing tutorial](/capabilities/guardrailing). | Model | Available Open-weight|Available via API| Description | Max Tokens| API Endpoints| |--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:| | Mistral 7B | :heavy_check_mark:
Apache2 |:heavy_check_mark: |The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of the release, it matched the capabilities of models up to 30B parameters. Learn more on our [blog post](https://mistral.ai/news/announcing-mistral-7b/)| 32k | `open-mistral-7b`| | Mixtral 8x7B |:heavy_check_mark:
Apache2 | :heavy_check_mark: |A sparse mixture of experts model. As such, it leverages up to 45B parameters but only uses about 12B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-of-experts/)| 32k | `open-mixtral-8x7b`| | Mixtral 8x22B |:heavy_check_mark:
Apache2 | :heavy_check_mark: |A bigger sparse mixture of experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-8x22b/)| 64k | `open-mixtral-8x22b`| | Codestral |:heavy_check_mark:
MNPL|:heavy_check_mark: | A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion | 32k | `codestral-latest`| | Codestral Mamba | :heavy_check_mark:
Apache2 | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba`| | Mathstral | :heavy_check_mark:
Apache2 | | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA| +| Mistral NeMo | :heavy_check_mark:
Apache2 | :heavy_check_mark: | A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement in any system using Mistral 7B that it supersedes. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo`| ## License - Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral Mamba, Mathstral, and Mistral NeMo are under [Apache 2 License](https://choosealicense.com/licenses/apache-2.0/), which permits their use without any constraints. @@ -41,6 +45,8 @@ We open-source both pre-trained models and fine-tuned models. These models are n | Mathstral-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mathstral-7B-v0.1)
[raw_weights](https://models.mistralcdn.com/mathstral-7b-v0-1/mathstral-7B-v0.1.tar)(md5sum: `5f05443e94489c261462794b1016f10b`) | - 32768 vocabulary size
- Supports v3 Tokenizer | +| Mistral-NeMo-Base-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407)
[raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-base-2407.tar)(md5sum: `c5d079ac4b55fc1ae35f51f0a3c0eb83`) | - 131k vocabulary size
- Supports tekken.json tokenizer | +| Mistral-NeMo-Instruct-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
[raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-instruct-2407.tar)(md5sum: `296fbdf911cb88e6f0be74cd04827fe7`) | - 131k vocabulary size
- Supports tekken.json tokenizer
- Supports function calling |
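To illustrate, the md5sums listed above can be checked after downloading a raw-weights archive. A minimal sketch using only the Python standard library (the local filename is an assumption):

```python
import hashlib

def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the md5sum listed in the table, e.g. for Mistral-NeMo-Instruct-2407.
expected = "296fbdf911cb88e6f0be74cd04827fe7"
actual = md5_of("mistral-nemo-instruct-2407.tar")  # assumed local filename for the downloaded archive
print("checksum OK" if actual == expected else f"checksum mismatch: {actual}")
```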
## Sizes @@ -54,6 +60,7 @@ We open-source both pre-trained models and fine-tuned models. These models are n | Codestral-Mamba-7B-v0.1 | 7.3B | 7.3B | 16 | | Mathstral-7B-v0.1 | 7.3B | 7.3B | 16 | +| Mistral-NeMo-12B-v0.1 | 12B | 12B | 28 - bf16
16 - fp8 |
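As a rough sanity check on these figures (assuming the last column is minimum GPU RAM in GB), weight memory is approximately the parameter count times the bytes per parameter, with the published numbers presumably leaving headroom for activations and the KV cache. An illustrative sketch:

```python
def approx_weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    # 1e9 parameters at N bytes each is ~N GB per billion parameters.
    return params_billion * bytes_per_param

for name, params in [("Mathstral-7B-v0.1", 7.3), ("Mistral-NeMo-12B-v0.1", 12.0)]:
    bf16 = approx_weight_gb(params, 2)  # bf16: 2 bytes per parameter
    fp8 = approx_weight_gb(params, 1)   # fp8: 1 byte per parameter
    print(f"{name}: ~{bf16:.0f} GB of bf16 weights, ~{fp8:.0f} GB of fp8 weights")
```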
## How to run? Check out [mistral-inference](https://github.com/mistralai/mistral-inference/), a Python package for running our models. You can install `mistral-inference` by
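For illustration, a minimal sketch of running one of the extracted instruct checkpoints with `mistral-inference` (assumed to be installed from PyPI, e.g. via `pip install mistral-inference`). The import paths, tokenizer filename, and `generate` signature follow the project's README of this period and may differ across versions, so treat them as assumptions:

```python
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

MODEL_DIR = "./mistral-nemo-instruct-2407"  # assumed path to the extracted raw weights

# Mistral NeMo ships the tekken.json tokenizer (see the table above).
tokenizer = MistralTokenizer.from_file(f"{MODEL_DIR}/tekken.json")
model = Transformer.from_folder(MODEL_DIR)

request = ChatCompletionRequest(messages=[UserMessage(content="Name three uses of a 12B model.")])
tokens = tokenizer.encode_chat_completion(request).tokens

out_tokens, _ = generate(
    [tokens], model, max_tokens=128, temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```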
diff --git a/docs/getting-started/changelog.mdx b/docs/getting-started/changelog.mdx index 30c14ee..eac7b38 100644 --- a/docs/getting-started/changelog.mdx +++ b/docs/getting-started/changelog.mdx @@ -7,7 +7,7 @@ sidebar_position: 1.8 This is the list of changes to the Mistral API. July 18, 2024 +- We released Mistral NeMo (`open-mistral-nemo`). July 16, 2024 - We released Codestral Mamba (`open-codestral-mamba`) and Mathstral. diff --git a/docs/getting-started/models.mdx b/docs/getting-started/models.mdx index f46d383..d4d43ef 100644 --- a/docs/getting-started/models.mdx +++ b/docs/getting-started/models.mdx @@ -23,7 +23,7 @@ They are ideal for customization, such as fine-tuning, due to their portability, | Codestral |:heavy_check_mark:
MNPL|:heavy_check_mark: | A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion | 32k | `codestral-latest`| | Codestral Mamba | :heavy_check_mark:
Apache2 | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba`| | Mathstral | :heavy_check_mark:
Apache2 | | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA| +| Mistral NeMo | :heavy_check_mark:
Apache2 | :heavy_check_mark: | A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement in any system using Mistral 7B that it supersedes. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo`| ## Pricing @@ -37,6 +37,7 @@ it is recommended to use the dated versions of the Mistral AI API. Additionally, be prepared for the deprecation of certain endpoints in the coming months. Here are the details of the available versions: +- `open-mistral-nemo`: currently points to `open-mistral-nemo-2407`. - `mistral-small-latest`: currently points to `mistral-small-2402`. - `mistral-medium-latest`: currently points to `mistral-medium-2312`. The previous `mistral-medium` has been dated and tagged as `mistral-medium-2312`. @@ -61,7 +62,6 @@ It can be used for complex multilingual reasoning tasks, including text understa - [Mathstral](https://mistral.ai/news/mathstral/): Mathstral stands on the shoulders of Mistral 7B and specialises in STEM subjects. It achieves state-of-the-art reasoning capacities in its size category across various industry-standard benchmarks. - [Mistral NeMo](https://mistral.ai/news/mistral-nemo/): Mistral NeMo's reasoning, world knowledge, and coding performance are state-of-the-art in its size category. As it relies on standard architecture, Mistral NeMo is easy to use and a drop-in replacement in any system using Mistral 7B that it supersedes. - ## Picking a model This guide will explore the performance and cost trade-offs, and discuss how to select the appropriate model for different use cases. We will delve into various factors to consider, offering guidance on choosing the right model for your specific needs. diff --git a/version.txt b/version.txt index 75163b6..c8fe2be 100644 --- a/version.txt +++ b/version.txt @@ -1 +1 @@ -v0.0.15 +v0.0.59
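To make the endpoint names above concrete, here is a minimal sketch of a chat call against the HTTP API using the `open-mistral-nemo` alias. The URL and the `choices[0].message.content` response shape follow the public chat-completions API, and `MISTRAL_API_KEY` is assumed to be set in the environment:

```python
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"
headers = {"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"}

payload = {
    "model": "open-mistral-nemo",  # or the dated version "open-mistral-nemo-2407"
    "messages": [{"role": "user", "content": "In one sentence, what is Mistral NeMo?"}],
    "max_tokens": 64,
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Pinning the dated alias (`open-mistral-nemo-2407`) trades automatic upgrades for reproducibility, in line with the versioning note above.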