Commit
Merge pull request #102 from mistralai/doc/v0.0.55
Update docs to v0.0.55
pandora-s-git authored Jul 18, 2024
2 parents 2b9c940 + 02c3f50 commit 6a710ed
Showing 7 changed files with 172 additions and 37 deletions.
8 changes: 4 additions & 4 deletions docs/capabilities/code-generation.mdx
@@ -236,8 +236,8 @@ curl --location "https://api.mistral.ai/v1/chat/completions" \
</TabItem>
</Tabs>

## Codestral-Mamba
We have also released Codestral-Mamba 7B, a Mamba 2 language model specialized in code generation, available via the instruct endpoint.
## Codestral Mamba
We have also released Codestral Mamba 7B, a Mamba 2 language model specialized in code generation, available via the instruct endpoint.
<Tabs>
<TabItem value="python" label="python" default>
```python
@@ -278,9 +278,9 @@ curl --location "https://api.mistral.ai/v1/chat/completions" \
</TabItem>
</Tabs>
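For quick reference, here is a minimal plain-HTTP sketch of the same call, using Python `requests` against the chat completions endpoint shown in the curl examples above. The `codestral-mamba-latest` alias and the `MISTRAL_API_KEY` environment variable are assumptions; adapt them to your setup.

```python
import os
import requests

# Minimal sketch: POST to the chat completions endpoint used in the curl
# examples above. Assumes MISTRAL_API_KEY is set and that the
# `codestral-mamba-latest` alias is available on your account.
API_URL = "https://api.mistral.ai/v1/chat/completions"
API_KEY = os.environ["MISTRAL_API_KEY"]

payload = {
    "model": "codestral-mamba-latest",
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a linked list."}
    ],
}

response = requests.post(
    API_URL,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```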

## Open-weight Codestral and Codestral-Mamba
## Open-weight Codestral and Codestral Mamba
Codestral is available open-weight under the [Mistral AI Non-Production (MNPL) License](https://mistral.ai/licences/MNPL-0.1.md) and
Codestral-Mamba is available open-weight under the Apache 2.0 license.
Codestral Mamba is available open-weight under the Apache 2.0 license.

Check out the README of [mistral-inference](https://github.com/mistralai/mistral-inference) to learn how to use `mistral-inference` to run Codestral.

18 changes: 11 additions & 7 deletions docs/getting-started/Open-weight-models.mdx
@@ -6,17 +6,18 @@ sidebar_position: 1.4

We open-source both pre-trained models and fine-tuned models. These models are not tuned for safety as we want to empower users to test and refine moderation based on their use cases. For safer models, follow our [guardrailing tutorial](/capabilities/guardrailing).

| Model |Open-weight|API| Description | Max Tokens| Endpoint|
| Model | Available Open-weight|Available via API| Description | Max Tokens| API Endpoints|
|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|
| Mistral 7B | :heavy_check_mark: <br/> Apache2 |:heavy_check_mark: |The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of the release, it matched the capabilities of models up to 30B parameters. Learn more on our [blog post](https://mistral.ai/news/announcing-mistral-7b/)| 32k | `open-mistral-7b`<br/>(aka `mistral-tiny-2312`)|
| Mixtral 8x7B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: |A sparse mixture of experts model. As such, it leverages up to 45B parameters but only uses about 12B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-of-experts/)| 32k | `open-mixtral-8x7b`<br/>(aka `mistral-small-2312`) |
| Mixtral 8x22B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: |A bigger sparse mixture of experts model with larger context window. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-8x22b/)| 64k | `open-mixtral-8x22b`|
| Mistral 7B | :heavy_check_mark: <br/> Apache2 |:heavy_check_mark: |The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of the release, it matched the capabilities of models up to 30B parameters. Learn more on our [blog post](https://mistral.ai/news/announcing-mistral-7b/)| 32k | `open-mistral-7b`|
| Mixtral 8x7B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: |A sparse mixture of experts model. As such, it leverages up to 45B parameters but only uses about 12B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-of-experts/)| 32k | `open-mixtral-8x7b`|
| Mixtral 8x22B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: |A bigger sparse mixture of experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Learn more on the dedicated [blog post](https://mistral.ai/news/mixtral-8x22b/)| 64k | `open-mixtral-8x22b`|
| Codestral |:heavy_check_mark: <br/> MNPL|:heavy_check_mark: | A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion | 32k | `codestral-latest`|
| Codestral-Mamba | :heavy_check_mark: | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `codestral-mamba-latest`|
| Mathstral | :heavy_check_mark: | :heavy_check_mark: | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA|
| Codestral Mamba | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba`|
| Mathstral | :heavy_check_mark: <br/> Apache2 | | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA|
| Mistral NeMo | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo`|

## License
- Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral-Mamba, and Mathstral are under [Apache 2 License](https://choosealicense.com/licenses/apache-2.0/), which permits their use without any constraints.
- Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral Mamba, Mathstral, and Mistral NeMo are under [Apache 2 License](https://choosealicense.com/licenses/apache-2.0/), which permits their use without any constraints.
- Codestral is under [Mistral AI Non-Production (MNPL) License](https://mistral.ai/licences/MNPL-0.1.md).


@@ -38,6 +39,8 @@ We open-source both pre-trained models and fine-tuned models. These models are n
| Codestral-22B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/Codestral-22B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/codestral-22b-v0-1/codestral-22B-v0.1.tar) (md5sum: `1ea95d474a1d374b1d1b20a8e0159de3`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
| Codestral-Mamba-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mamba-codestral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/codestral-mamba-7b-v0-1/codestral-mamba-7B-v0.1.tar) (md5sum: `d3993e4024d1395910c55db0d11db163`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
| Mathstral-7B-v0.1 | [Hugging Face](https://huggingface.co/mistralai/mathstral-7B-v0.1) <br/> [raw_weights](https://models.mistralcdn.com/mathstral-7b-v0-1/mathstral-7B-v0.1.tar) (md5sum: `5f05443e94489c261462794b1016f10b`) | - 32768 vocabulary size <br/> - Supports v3 Tokenizer |
| Mistral-NeMo-Base-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-base-2407.tar) (md5sum: `c5d079ac4b55fc1ae35f51f0a3c0eb83`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer |
| Mistral-NeMo-Instruct-2407 | [Hugging Face](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) <br/> [raw_weights](https://models.mistralcdn.com/mistral-nemo-2407/mistral-nemo-instruct-2407.tar) (md5sum: `296fbdf911cb88e6f0be74cd04827fe7`) | - 131k vocabulary size <br/> - Supports tekken.json tokenizer <br/> - Supports function calling |
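The md5sums in the table above can be used to verify a downloaded raw-weights tarball. Here is a minimal sketch, assuming the tarball has already been downloaded to the placeholder path shown:

```python
import hashlib

# Minimal sketch: verify a downloaded raw-weights tarball against the
# md5sum listed in the table above. The local path is a placeholder.
EXPECTED_MD5 = "296fbdf911cb88e6f0be74cd04827fe7"  # mistral-nemo-instruct-2407.tar
TARBALL_PATH = "mistral-nemo-instruct-2407.tar"

md5 = hashlib.md5()
with open(TARBALL_PATH, "rb") as f:
    # Read in 1 MiB chunks so large tarballs do not need to fit in memory.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)

if md5.hexdigest() != EXPECTED_MD5:
    raise ValueError(f"Checksum mismatch: got {md5.hexdigest()}, expected {EXPECTED_MD5}")
print("Checksum OK")
```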


## Sizes
@@ -50,6 +53,7 @@ We open-source both pre-trained models and fine-tuned models. These models are n
| Codestral-22B-v0.1 | 22.2B | 22.2B | 60 |
| Codestral-Mamba-7B-v0.1 | 7.3B | 7.3B | 16 |
| Mathstral-7B-v0.1 | 7.3B | 7.3B | 16 |
| Mistral-NeMo-12B-v0.1 | 12B | 12B | 28 - bf16 <br/> 16 - fp8 |

## How to run?
Check out [mistral-inference](https://github.com/mistralai/mistral-inference/), a Python package for running our models. You can install `mistral-inference` by
5 changes: 4 additions & 1 deletion docs/getting-started/changelog.mdx
@@ -6,8 +6,11 @@ sidebar_position: 1.8

This is the list of changes to the Mistral API.

July 18, 2024
- We released Mistral NeMo (`open-mistral-nemo`).

July 16, 2024
- We released Codestral-Mamba and Mathstral.
- We released Codestral Mamba (`open-codestral-mamba`) and Mathstral.

Jun 5, 2024
- We released the fine-tuning API. Check out the [capability docs](/capabilities/finetuning/) and [guides](/guides/finetuning/).
3 changes: 3 additions & 0 deletions docs/getting-started/introduction.mdx
@@ -17,6 +17,9 @@ We release both open source and commercial models, driving innovation and conven
- Mistral 7b, our first dense model released [September 2023](https://mistral.ai/news/announcing-mistral-7b/)
- Mixtral 8x7b, our first sparse mixture-of-experts released [December 2023](https://mistral.ai/news/mixtral-of-experts/)
- Mixtral 8x22b, our best open source model to date released [April 2024](https://mistral.ai/news/mixtral-8x22b/)
- Mathstral 7b, our first open source math model released [July 2024](https://mistral.ai/news/mathstral/)
- Codestral Mamba 7b, our first open source Mamba 2 model released [July 2024](https://mistral.ai/news/codestral-mamba/)
- Mistral NeMo 12b, our best open source multilingual model released [July 2024](https://mistral.ai/news/mistral-nemo/)

### Commercial

13 changes: 5 additions & 8 deletions docs/getting-started/models.mdx
@@ -21,8 +21,9 @@ They are ideal for customization, such as fine-tuning, due to their portability,
| Mistral Large || :heavy_check_mark: |Our flagship model that's ideal for complex tasks that require large reasoning capabilities or are highly specialized (Synthetic Text Generation, Code Generation, RAG, or Agents). Learn more on our [blog post](https://mistral.ai/news/mistral-large/)| 32k | `mistral-large-latest`|
| Mistral Embeddings ||:heavy_check_mark: | A model that converts text into 1024-dimensional embedding vectors. Embedding models enable retrieval and retrieval-augmented generation applications. It achieves a retrieval score of 55.26 on MTEB (see the usage sketch after this table) | 8k | `mistral-embed`|
| Codestral |:heavy_check_mark: <br/> MNPL|:heavy_check_mark: | A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion | 32k | `codestral-latest`|
| Codestral-Mamba | :heavy_check_mark: | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `codestral-mamba-latest`|
| Mathstral | :heavy_check_mark: | :heavy_check_mark: | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA|
| Codestral Mamba | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A Mamba 2 language model specialized in code generation. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba`|
| Mathstral | :heavy_check_mark: <br/> Apache2 | | A math-specific 7B model designed for math reasoning and scientific tasks. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA|
| Mistral NeMo | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo`|
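As a usage sketch for the `mistral-embed` row above, the snippet below posts two inputs to the `v1/embeddings` endpoint and reads back the 1024-dimensional vectors. The response is assumed to follow the standard `data[i].embedding` layout, and `MISTRAL_API_KEY` is assumed to be set in the environment.

```python
import os
import requests

# Minimal sketch: embed two pieces of text with `mistral-embed`.
# Each returned vector has 1024 dimensions.
resp = requests.post(
    "https://api.mistral.ai/v1/embeddings",
    headers={
        "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "mistral-embed",
        "input": ["Deep learning is", "Mistral models are"],
    },
    timeout=30,
)
resp.raise_for_status()
vectors = [item["embedding"] for item in resp.json()["data"]]
print(len(vectors), len(vectors[0]))  # expected: 2 1024
```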

## Pricing

@@ -36,18 +37,12 @@ it is recommended to use the dated versions of the Mistral AI API.
Additionally, be prepared for the deprecation of certain endpoints in the coming months.

Here are the details of the available versions (a short example follows this list):
- `open-mistral-7b`: currently points to `mistral-tiny-2312`.
It used to be called `mistral-tiny`, which will be deprecated shortly.
- `open-mixtral-8x7b`: currently points to `mistral-small-2312`.
It used to be called `mistral-small`, which will be deprecated shortly.
- `open-mixtral-8x22b` points to `open-mixtral-8x22b-2404`.
- `mistral-small-latest`: currently points to `mistral-small-2402`.
- `mistral-medium-latest`: currently points to `mistral-medium-2312`.
The previous `mistral-medium` has been dated and tagged as `mistral-medium-2312`.
Mistral Medium will be deprecated shortly.
- `mistral-large-latest`: currently points to `mistral-large-2402`.
- `codestral-latest`: currently points to `codestral-2405`.
- `codestral-mamba-latest`: currently points to `codestral-mamba-2407`.
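
To check which dated versions your API key can access before pinning one, here is a minimal sketch; it assumes the standard `v1/models` listing endpoint and an API key in `MISTRAL_API_KEY`.

```python
import os
import requests

# Minimal sketch: list the model IDs available to your API key, so you can
# pin a dated version (e.g. "mistral-large-2402") rather than a "-latest" alias.
resp = requests.get(
    "https://api.mistral.ai/v1/models",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    timeout=30,
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```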

## Benchmark results
Mistral ranks second among all models generally available through an API.
@@ -64,6 +59,8 @@ It can be used for complex multilingual reasoning tasks, including text understa
- [Codestral](https://mistral.ai/news/codestral/): as a 22B model, Codestral sets a new standard in the performance/latency space for code generation compared to previous models used for coding.
- [Codestral-Mamba](https://mistral.ai/news/codestral-mamba/): we have trained this model with advanced code and reasoning capabilities, enabling it to perform on par with SOTA transformer-based models.
- [Mathstral](https://mistral.ai/news/mathstral/): Mathstral stands on the shoulders of Mistral 7B and specialises in STEM subjects. It achieves state-of-the-art reasoning capacities in its size category across various industry-standard benchmarks.
- [Mistral NeMo](https://mistral.ai/news/mistral-nemo/): Mistral NeMo's reasoning, world knowledge, and coding performance are state-of-the-art in its size category. As it relies on a standard architecture, Mistral NeMo is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes.


## Picking a model

