-
Notifications
You must be signed in to change notification settings - Fork 59
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
GitHub Actions
committed
Sep 24, 2024
1 parent
df6c2bd
commit 14a2a19
Showing
25 changed files
with
423 additions
and
95 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"label": "Models", | ||
"position": 1.3, | ||
"link": { | ||
"type": "doc", | ||
"id": "models_overview" | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
--- | ||
id: benchmark | ||
title: Benchmarks | ||
slug: benchmark | ||
--- | ||
|
||
LLM (Large Language Model) benchmarks are standardized tests or datasets used to evaluate the performance of large language models. These benchmarks help researchers and developers understand the strengths and weaknesses of their models and compare them with other models in a systematic way. | ||
|
||
## Mistral benchmarks | ||
Mistral demonstrates top-tier reasoning capabilities and excels in advanced reasoning, multilingual tasks, math, and code generation. The company reports benchmark results on popular public benchmarks such as MMLU (Massive Multitask Language Understanding), MT-bench, and others. | ||
|
||
You can find the benchmark results in the following blog posts: | ||
- [Pixtral](https://mistral.ai/news/pixtral-12b/): Pixtral 12B the first open-source model to demonstrate state-of-the-art multimodal understanding, without regressing on abilities in pure text. | ||
- [Mistral Large](https://mistral.ai/news/mistral-large-2407/): a cutting-edge text generation model with top-tier reasoning capabilities. | ||
It can be used for complex multilingual reasoning tasks, including text understanding, transformation, and code generation. | ||
- [Mistral Nemo](https://mistral.ai/news/mistral-nemo/): Mistral Nemo's reasoning, world knowledge, and coding performance are state-of-the-art in its size category. As it relies on standard architecture, Mistral Nemo is easy to use and a drop-in replacement in any system using Mistral 7B that it supersedes. | ||
- [Codestral](https://mistral.ai/news/codestral/): as a 22B model, Codestral sets a new standard on the performance/latency space for code generation compared to previous models used for coding. | ||
- [Codestral-Mamba](https://mistral.ai/news/codestral-mamba/): we have trained this model with advanced code and reasoning capabilities, enabling the model to have a strong performance on par with SOTA transformer-based models. | ||
- [Mathstral](https://mistral.ai/news/mathstral/): Mathstral stands on the shoulders of Mistral 7B and specialises in STEM subjects. It achieves state-of-the-art reasoning capacities in its size category across various industry-standard benchmarks. | ||
- [Mixtral 8x22B](https://mistral.ai/news/mixtral-8x22b/): our most performant open model. It handles English, | ||
French, Italian, German, Spanish and performs strongly on code-related tasks. Natively handles function calling. | ||
- [Mixtral 8x7B](https://mistral.ai/news/mixtral-of-experts/): outperforms Llama 2 70B on most benchmarks with 6x faster inference and matches | ||
or outperforms GPT3.5 on most standard benchmarks. It handles English, French, Italian, German and Spanish, and shows strong performance in code generation. | ||
- [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/): outperforms Llama 2 13B on all benchmarks and Llama 1 34B on many benchmarks. | ||
|
||
## Scale Seal Leaderboard | ||
|
||
[Scale AI](https://scale.com/leaderboard) reports private benchmark results in coding, instruction following, math, and Spanish. Mistral Large performs exceptionally well in code and Spanish, outperforming Llama 3 405B in these areas. | ||
|
||
## Artificial Analysis | ||
|
||
[Artificial Analysis](https://artificialanalysis.ai/models) compares and evaluates AI models across key performance metrics, including quality, price, output speed, latency, context window, and others. Our model has several areas of outstanding performance worth highlighting. | ||
|
||
- Artificial Analysis Quality Index: Our model ranks 3rd in this benchmark, surpassing even the 405B model. This achievement underscores our model's superior ability to analyze and generate high-quality insights. | ||
- Coding (HumanEval): In the HumanEval benchmark, our model secures the 3rd position, again outperforming the 405B model. This highlights our model's exceptional proficiency in coding tasks. | ||
- Quantitative Reasoning (MATH): Our model places 4th in the MATH benchmark, ahead of the 405B model. This demonstrates our model's strong quantitative reasoning capabilities. | ||
- Scientific Reasoning & Knowledge (GPQA): In the GPQA benchmark, our model ranks 4th, showcasing its robust scientific reasoning and knowledge retention abilities. | ||
|
||
## Qualitative feedback | ||
We've gathered a lot of valuable insights from platforms like Reddit and Twitter. Below are some highlights and quotes from users who have shared their experiences with our models. | ||
|
||
### Pixtral: | ||
|
||
> Pixtral absolutely SLAYS at OCR. | ||
> Very impressive at charts and diagrams and drawings and photos of screens. | ||
> It outperforms GPT-4o-mini in many examples I’ve tested. | ||
### Mistral Large: | ||
|
||
> Mistral large 2 has been my go to model. | ||
> This model is so good. In terms of local models, this is probably the first that I honestly felt was proprietary tier for coding. | ||
### Mistral Nemo: | ||
|
||
> I’ve been playing with Nemo for a few days now, and it blows me away at how coherent it is. It’s slightly ‘less creative and more repetitive’ than Llama 3 8B fine-tunes… But it feels ‘more coherent and has better instruction capabilities’. | ||
> Just wanna say thank you to those genius french over at Mistral for Nemo. 12B parameters and 128k context is a very useful combination. It’s enough of a size improvement over 7B to feel a little more “solid” when talking to it, and it runs circles around Llama-2-13B, with 32x the context length. Thank you mistral! |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
--- | ||
id: models_overview | ||
title: Models Overview | ||
slug: models_overview | ||
--- | ||
|
||
|
||
Mistral provides two types of models: free models and premier models. | ||
|
||
:::note[ ] | ||
- For API pricing details, please visit our [pricing page](https://mistral.ai/technology/#pricing). | ||
- If you are interested in purchasing a commercial license for our models, please [contact our team](https://mistral.ai/contact/). | ||
::: | ||
|
||
### Premier models | ||
|
||
| Model | Weight availability|Available via API| Description | Max Tokens| API Endpoints|Version| | ||
|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:| | ||
| Mistral Large |:heavy_check_mark: <br/> [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md)| :heavy_check_mark: |Our top-tier reasoning model for high-complexity tasks with the lastest version v2 released July 2024. Learn more on our [blog post](https://mistral.ai/news/mistral-large-2407/)| 128k | `mistral-large-latest`| 24.07| | ||
| Mistral Small | :heavy_check_mark: <br/> [Mistral Research License](https://mistral.ai/licenses/MRL-0.1.md) | :heavy_check_mark: | Our latest enterprise-grade small model with the lastest version v2 released September 2024. Learn more on our [blog post](https://mistral.ai/news/september-24-release/) | 32k | `mistral-small-latest` | 24.09| | ||
| Codestral |:heavy_check_mark: <br/> [Mistral Non-Production License](https://mistral.ai/licenses/MNPL-0.1.md) | :heavy_check_mark: | Our cutting-edge language model for coding released May 2024 | 32k | `codestral-latest` | 24.05| | ||
| Mistral Embed | | :heavy_check_mark: | Our state-of-the-art semantic for extracting representation of text extracts | 8k | `mistral-embed` | 23.12| | ||
|
||
|
||
### Free models | ||
|
||
- **Latest models** | ||
|
||
| Model | Weight availability|Available via API| Description | Max Tokens| API Endpoints|Version| | ||
|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:| | ||
| Pixtral | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | A 12B model with image understanding capabilities in addition to text. Learn more on our [blog post](https://mistral.ai/news/pixtral-12b/)| 128k | `pixtral-12b-2409` | 24.09| | ||
|
||
- **Research models** | ||
|
||
| Model | Weight availability|Available via API| Description | Max Tokens| API Endpoints|Version| | ||
|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:| | ||
| Mistral Nemo | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | Our best multilingual open source model released July 2024. Learn more on our [blog post](https://mistral.ai/news/mistral-nemo/) | 128k | `open-mistral-nemo`| 24.07| | ||
| Codestral Mamba | :heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | Our first mamba 2 open source model released July 2024. Learn more on our [blog post](https://mistral.ai/news/codestral-mamba/) | 256k | `open-codestral-mamba`| v0.1| | ||
| Mathstral | :heavy_check_mark: <br/> Apache2 | | Our first math open source model released July 2024. Learn more on our [blog post](https://mistral.ai/news/mathstral/) | 32k | NA| v0.1| | ||
|
||
|
||
- **Legacy models** | ||
|
||
| Model | Weight availability|Available via API| Description | Max Tokens| API Endpoints|Version| | ||
|--------------------|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:|:--------------------:| | ||
| Mistral 7B | :heavy_check_mark: <br/> Apache2 |:heavy_check_mark: | Our first dense model released September 2023. Learn more on our [blog post](https://mistral.ai/news/announcing-mistral-7b/)| 32k | `open-mistral-7b`| v0.3| | ||
| Mixtral 8x7B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: |Our first sparse mixture-of-experts released December 2023. Learn more on our [blog post](https://mistral.ai/news/mixtral-of-experts/)| 32k | `open-mixtral-8x7b`| v0.1| | ||
| Mixtral 8x22B |:heavy_check_mark: <br/> Apache2 | :heavy_check_mark: | Our best open source model to date released April 2024. Learn more on our [blog post](https://mistral.ai/news/mixtral-8x22b/)| 64k | `open-mixtral-8x22b`| v0.1| | ||
|
||
|
||
## API versioning | ||
|
||
Mistral AI API are versions with specific release dates. | ||
To prevent any disruptions due to model updates and breaking changes, | ||
it is recommended to use the dated versions of the Mistral AI API. | ||
Additionally, be prepared for the deprecation of certain endpoints in the coming months. | ||
|
||
Here are the details of the available versions: | ||
- `open-mistral-nemo`: currently points to `open-mistral-nemo-2407`. | ||
- `mistral-small-latest`: currently points to `mistral-small-2409`. `mistral-small-2402` is deprecated. | ||
- `mistral-medium-latest`: currently points to `mistral-medium-2312`. | ||
The previous `mistral-medium` has been dated and tagged as `mistral-medium-2312`. | ||
Mistral Medium will be deprecated shortly. | ||
- `mistral-large-latest`: currently points to `mistral-large-2407`. `mistral-large-2402` will be deprecated shortly. | ||
- `codestral-latest`: currently points to `codestral-2405`. |
Oops, something went wrong.