update!: Add local AI chat recommendations #2810

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
187 changes: 187 additions & 0 deletions docs/ai-chat.md
@@ -0,0 +1,187 @@
---
meta_title: "Recommended AI Chat: Private ChatGPT Alternatives - Privacy Guides"
title: "AI Chat"
icon: material/assistant
description: Unlike OpenAI's ChatGPT and its Big Tech competitors, these AI tools do not train their models using your conversations.
cover: ai-chatbots.webp
global:
- [randomize-element, "table tbody"]
---
<small>Protects against the following threat(s):</small>

- [:material-server-network: Service Providers](basics/common-threats.md#privacy-from-service-providers){ .pg-teal }
- [:material-account-cash: Surveillance Capitalism](basics/common-threats.md#surveillance-as-a-business-model){ .pg-brown }
- [:material-close-outline: Censorship](basics/common-threats.md#avoiding-censorship){ .pg-blue-gray }

Since the release of ChatGPT in 2022, interactions with Large Language Models (LLMs) have become increasingly common. LLMs can help us write better, understand unfamiliar subjects, or answer a wide range of questions. They work by statistically predicting the next word based on vast amounts of data scraped from the web.

## Privacy Concerns about LLMs

However, the data used to train AI models includes a massive amount of _private_ data. Developers of AI software often use [Reinforcement Learning from Human Feedback](https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback) (RLHF) to improve the quality of LLMs, which means AI companies may read your private AI chats as well as store them. This practice also introduces a risk of data breaches. Furthermore, there is a real possibility that an LLM will leak your private chat information in future conversations with other users.

If you are concerned about these practices, you can either refuse to use AI or use [truly open-source models](https://proton.me/blog/how-to-build-privacy-first-ai), which publicly release their training datasets and allow you to inspect them. One such model is [OLMoE](https://allenai.org/blog/olmoe), made by the [Allen Institute for AI](https://allenai.org/open-data).

Alternatively, you can run AI models locally as a more private and secure alternative to cloud-based solutions: your data never leaves your device and is therefore never shared with third parties. This also allows you to share sensitive information with the local model without worry.

## Hardware for Local AI Models

Local models are also fairly accessible, since it is possible to run smaller models on modest hardware. A computer with at least 8GB of RAM is sufficient to run smaller models at lower speeds. Using more powerful hardware, such as a dedicated GPU with sufficient VRAM or a modern system with fast LPDDR5X memory, will offer the best experience.

LLMs can usually be differentiated by their number of parameters, which can range from 1.3B to 405B. The higher the number of parameters, the greater the model's capabilities. For example, models below 6.7B parameters are only good for basic tasks like text summaries, while models between 7B and 13B offer a good compromise between quality and speed. Models with advanced reasoning capabilities are generally around 70B.

For consumer-grade hardware, it is generally recommended to use [quantized models](https://huggingface.co/docs/optimum/en/concept_guides/quantization) for the best balance between model quality and performance. Check out the table below for more precise information about the typical requirements for different sizes of quantized models.

| Model Size (in Parameters) | Minimum RAM | Minimum Processor |
|---|---|---|
| 7B | 8GB | Modern CPU (AVX2 support) |
| 13B | 16GB | Modern CPU (AVX2 support) |
| 70B | 72GB | Dedicated GPU with sufficient VRAM |
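As a back-of-the-envelope illustration of the table above, the memory needed to load a model can be estimated from its parameter count and quantization bit width. This is a rough sketch only; the overhead factor is an assumption, and actual usage also depends on context length and the runtime you use:

```python
def estimate_model_memory_gb(params_billions: float, bits_per_param: int = 4,
                             overhead_factor: float = 1.2) -> float:
    """Roughly estimate the RAM/VRAM needed to load a quantized model.

    bits_per_param: 16 for unquantized FP16 weights; 8 or 4 for common
    quantizations. overhead_factor is a rough allowance for the context
    cache and runtime overhead (an assumption, not a measured value).
    """
    bytes_per_param = bits_per_param / 8
    raw_gb = params_billions * 1e9 * bytes_per_param / 1e9
    return raw_gb * overhead_factor

# A 7B model quantized to 4 bits needs roughly 4 GB to load:
print(round(estimate_model_memory_gb(7, bits_per_param=4), 1))  # prints 4.2
```

This is why 8GB of RAM comfortably fits a quantized 7B model, while 70B models push past what consumer systems without a dedicated GPU can handle.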

To run AI locally, you need both an AI model and an AI client.

## AI Models

### Find and Choose a Model

There are many permissively licensed models available to download. **[Hugging Face](https://huggingface.co/models?library=gguf)** is a platform that lets you browse, research, and download models in common formats like GGUF. Companies that provide good open-weights models include big names like Mistral, Meta, Microsoft, and Google. However, there are also many community models and 'fine-tunes' available. As mentioned above, [quantized models](https://huggingface.co/docs/optimum/en/concept_guides/quantization) offer the best balance between model quality and performance for those using consumer-grade hardware.

To help you choose a model that fits your needs, you can look at leaderboards and benchmarks. The most widely-used leaderboard is [LM Arena](https://lmarena.ai/), a "Community-driven Evaluation for Best AI chatbots". There is also the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard), which focuses on the performance of open-weights models on common benchmarks like MMLU-PRO. However, there are also specialized benchmarks which measure factors like [emotional intelligence](https://eqbench.com/), ["uncensored general intelligence"](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard), and many [others](https://www.nebuly.com/blog/llm-leaderboards).

### Model Security

Once you have found an AI model to your liking, you should download it in a safe manner. If you use an AI client that maintains its own library of model files (such as [Ollama](#ollama-cli) and [Llamafile](#llamafile)), you should download models from there. However, if you want to download a model not present in that library, or if you use an AI client that doesn't maintain its own library (such as [Kobold.cpp](#koboldcpp)), you will need to take extra steps to ensure that the model you download is safe and legitimate.

We recommend downloading model files from Hugging Face, as it provides several features to verify that your download is genuine and safe to use.

To check the authenticity and safety of the model, look for:

- Model cards with clear documentation
- A verified organization badge
- Community reviews and usage statistics
- A "Safe" badge next to the model file (Hugging Face only)
- Matching checksums[^1]
    - On Hugging Face, you can find the hash by clicking on a model file and looking for the **Copy SHA256** button below it. You should compare this checksum with the one from the model file you downloaded.

A downloaded model is generally safe if it satisfies all of the above checks.
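The checksum comparison in the last step can also be done programmatically. A minimal sketch, reading the file in chunks so that multi-gigabyte model files never need to fit in memory (the file name and expected hash in the comment are hypothetical placeholders):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks
    so that even very large model files never need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the hash from Hugging Face's "Copy SHA256" button.
# Both the file name and the expected value here are placeholders:
#
#   expected = "<published checksum>"
#   if sha256_of_file("model-q4_k_m.gguf") != expected:
#       raise SystemExit("Checksum mismatch - do not use this model file!")
```

The same result can be obtained with the command-line tools mentioned in the footnote; the point is that the hash you compute locally must match the one the publisher provides.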

## AI Chat Clients

| Local Client | GPU Support | Image Generation | Speech Recognition | Automatically Downloaded Models | Custom Parameters |
|---|---|---|---|---|---|
| [Kobold.cpp](#koboldcpp) | :material-check:{ .pg-green } | :material-check:{ .pg-green } | :material-check:{ .pg-green } | :material-close:{ .pg-red } | :material-check:{ .pg-green } |
| [Ollama](#ollama-cli) | :material-check:{ .pg-green } | :material-close:{ .pg-red } | :material-close:{ .pg-red } | :material-check:{ .pg-green } | :material-close:{ .pg-red } |
| [Llamafile](#llamafile) | :material-check:{ .pg-green } | :material-close:{ .pg-red } | :material-close:{ .pg-red } | :material-alert-outline:{ .pg-orange } Few models available | :material-alert-outline:{ .pg-orange } |

### Kobold.cpp

<div class="admonition recommendation" markdown>

![Kobold.cpp Logo](assets/img/ai-chat/kobold.png){align=right}

Kobold.cpp is an AI client that runs locally on your Windows, Mac, or Linux computer.

In addition to supporting a large range of text models, Kobold.cpp also supports image generators such as [Stable Diffusion](https://stability.ai/stable-image), and automatic speech recognition tools, such as [Whisper](https://github.com/ggerganov/whisper.cpp).

[:octicons-home-16: Homepage](https://github.com/LostRuins/koboldcpp){ .md-button .md-button--primary }
[:octicons-info-16:](https://github.com/LostRuins/koboldcpp/wiki){ .card-link title="Documentation" }
[:octicons-code-16:](https://github.com/LostRuins/koboldcpp){ .card-link title="Source Code" }
[:octicons-lock-16:](https://github.com/LostRuins/koboldcpp/blob/2f3597c29abea8b6da28f21e714b6b24a5aca79b/SECURITY.md){ .card-link title="Security Policy" }

<details class="downloads" markdown>
<summary>Downloads</summary>

- [:fontawesome-brands-windows: Windows](https://github.com/LostRuins/koboldcpp/releases)
- [:simple-apple: macOS](https://github.com/LostRuins/koboldcpp/releases)
- [:simple-linux: Linux](https://github.com/LostRuins/koboldcpp/releases)

</details>

</div>

<div class="admonition note" markdown>
<p class="admonition-title">Compatibility Issues</p>

Kobold.cpp might not run on computers without AVX/AVX2 support.

</div>

Kobold.cpp shines when you are looking for heavy customization and tweaking, such as for roleplaying purposes. It allows you to modify parameters such as the AI model's temperature and the AI chat's system prompt. It also supports creating a network tunnel to access AI models from other devices, such as your phone.

### Ollama (CLI)

<div class="admonition recommendation" markdown>

![Ollama Logo](assets/img/ai-chat/ollama.svg){align=right}

Ollama is a command-line AI client that is available on macOS, Linux, and Windows. Ollama is a great choice if you're looking for an AI client that's easy to use and widely compatible. It also requires no manual setup, while still using inference optimizations and other techniques to make outputs faster.

In addition to supporting a wide range of text models, Ollama also supports [LLaVA](https://github.com/haotian-liu/LLaVA) models and has experimental support for Meta's [Llama vision capabilities](https://huggingface.co/blog/llama32#what-is-llama-32-vision).

[:octicons-home-16: Homepage](https://github.com/ollama/ollama){ .md-button .md-button--primary }
[:octicons-info-16:](https://github.com/ollama/ollama#readme){ .card-link title="Documentation" }
[:octicons-code-16:](https://github.com/ollama/ollama){ .card-link title="Source Code" }

<details class="downloads" markdown>
<summary>Downloads</summary>

- [:fontawesome-brands-windows: Windows](https://ollama.com/download/windows)
- [:simple-apple: macOS](https://ollama.com/download/mac)
- [:simple-linux: Linux](https://ollama.com/download/linux)

</details>

</div>

Ollama simplifies the process of setting up a local AI chat by automatically downloading the AI model you want to use. For example, running `ollama run llama3.2` will automatically download and run the Llama 3.2 model. Furthermore, Ollama maintains its own [model library](https://ollama.com/library) hosting the files of various AI models. This ensures that models are vetted for both performance and security, eliminating the need to manually verify model authenticity.
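Beyond the CLI, Ollama also runs a local HTTP server (by default on `localhost:11434`) that other applications can query. The sketch below builds a request for its `/api/generate` endpoint; the endpoint and payload shape reflect Ollama's API documentation at the time of writing and should be verified against the current docs:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> bytes:
    """Build a non-streaming generation request for the local Ollama API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the locally running Ollama server and return its reply.

    Requires the Ollama server to be running and the model already pulled
    (e.g. with `ollama pull llama3.2`).
    """
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# Example (only runs against a local Ollama server):
# print(ask("llama3.2", "Explain quantization in one sentence."))
```

Because the server listens only on localhost by default, prompts sent this way never leave your machine.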

### Llamafile

<div class="admonition recommendation" markdown>

![Llamafile Logo](assets/img/ai-chat/llamafile.svg){align=right}

Llamafile is a lightweight single-file executable that allows users to run large language models locally on their own computers without any setup involved. It is [backed by Mozilla](https://hacks.mozilla.org/2023/11/introducing-llamafile) and available on Linux, macOS, and Windows.

Llamafile also supports LLaVA. However, it does not support speech recognition or image generation.

[:octicons-home-16: Homepage](https://github.com/Mozilla-Ocho/llamafile){ .md-button .md-button--primary }
[:octicons-info-16:](https://github.com/Mozilla-Ocho/llamafile/?tab=readme-ov-file#llamafile){ .card-link title="Documentation" }
[:octicons-code-16:](https://github.com/Mozilla-Ocho/llamafile){ .card-link title="Source Code" }
[:octicons-lock-16:](https://github.com/Mozilla-Ocho/llamafile#security){ .card-link title="Security Policy" }

<details class="downloads" markdown>
<summary>Downloads</summary>

- [:fontawesome-solid-desktop: Desktop](https://github.com/Mozilla-Ocho/llamafile#quickstart)

</details>

</div>

Mozilla has only made llamafiles available for some Llama and Mistral models, and few third-party llamafiles are available.

If you use Llamafile on Windows, be aware that Windows limits `.exe` files to 4GB, and most models are larger than that. To work around this restriction, you can [load external weights](https://github.com/Mozilla-Ocho/llamafile#using-llamafile-with-external-weights).

## Criteria

Please note we are not affiliated with any of the projects we recommend. In addition to [our standard criteria](about/criteria.md), we have developed a clear set of requirements to allow us to provide objective recommendations. We suggest you familiarize yourself with this list before choosing to use a project and conduct your own research to ensure it's the right choice for you.

### Minimum Requirements

- Must be open-source.
- Must not send personal data, including chat data.
- Must be available on Linux.
- Must not require a GPU.
- Must have support for GPU-powered fast inference.
- Must not require an internet connection.

### Best-Case

Our best-case criteria represent what we *would* like to see from the perfect project in this category. Our recommendations may not include any or all of this functionality, but those which do may rank higher than others on this page.

- Should be multi-platform.
- Should be easy to download and set up, e.g. with a one-click install process.
- Should have a built-in model downloader option.
- The user should be able to modify the LLM parameters, such as its system prompt or temperature.

[^1]: A file checksum is a type of anti-tampering fingerprint. A developer usually provides a checksum in a text file that can be downloaded separately, or on the download page itself. Verifying that the checksum of the file you downloaded matches the one provided by the developer helps ensure that the file is genuine and wasn't tampered with in transit. You can use commands like `sha256sum` on Linux, `shasum -a 256` on macOS, or `certutil -hashfile file SHA256` on Windows to generate the downloaded file's checksum.
12 changes: 12 additions & 0 deletions docs/tools.md
@@ -356,6 +356,18 @@ We [recommend](dns.md#recommended-providers) a number of encrypted DNS servers b

## Software

### AI Chat

<div class="grid cards" markdown>

- ![Kobold logo](assets/img/ai-chat/kobold.png){ .twemoji loading=lazy }[Kobold.cpp](ai-chat.md#koboldcpp)
- ![Llamafile logo](assets/img/ai-chat/llamafile.svg){ .twemoji loading=lazy }[Llamafile](ai-chat.md#llamafile)
- ![Ollama logo](assets/img/ai-chat/ollama.svg){ .twemoji loading=lazy }[Ollama](ai-chat.md#ollama-cli)

</div>

[Learn more :material-arrow-right-drop-circle:](ai-chat.md)

### Calendar Sync

<div class="grid cards" markdown>
6 changes: 6 additions & 0 deletions includes/abbreviations.en.txt
@@ -49,6 +49,8 @@
*[ISPs]: Internet Service Providers
*[JNI]: Java Native Interface
*[KYC]: Know Your Customer
*[LLaVA]: Large Language and Vision Assistant (multimodal AI model)
*[LLMs]: Large Language Models (AI models such as ChatGPT)
*[LUKS]: Linux Unified Key Setup (Full-Disk Encryption)
*[MAC]: Media Access Control
*[MDAG]: Microsoft Defender Application Guard
@@ -62,6 +64,7 @@
*[OCSP]: Online Certificate Status Protocol
*[OEM]: Original Equipment Manufacturer
*[OEMs]: Original Equipment Manufacturers
*[open-weights]: An open-weights model is an AI model that anyone can download and use, but whose underlying training data and/or algorithms are proprietary.
*[OS]: Operating System
*[OTP]: One-Time Password
*[OTPs]: One-Time Passwords
@@ -73,6 +76,7 @@
*[PII]: Personally Identifiable Information
*[QNAME]: Qualified Name
*[QUIC]: A network protocol based on UDP, but aiming to combine the speed of UDP with the reliability of TCP.
*[rate limits]: Rate limits are restrictions that a service imposes on the number of times a user can access their services within a specified period of time.
*[rolling release]: Updates which are released frequently rather than set intervals
*[RSS]: Really Simple Syndication
*[SELinux]: Security-Enhanced Linux
@@ -86,6 +90,8 @@
*[SaaS]: Software as a Service (cloud software)
*[SoC]: System on Chip
*[SSO]: Single sign-on
*[system prompt]: The system prompt of an AI chat is the set of general instructions given by a human to guide how it should operate.
*[temperature]: AI temperature is a parameter used in AI models to control the level of randomness and creativity in the generated text.
*[TCP]: Transmission Control Protocol
*[TEE]: Trusted Execution Environment
*[TLS]: Transport Layer Security
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -403,6 +403,7 @@ nav:
- "search-engines.md"
- "vpn.md"
- !ENV [NAV_SOFTWARE, "Software"]:
- "ai-chat.md"
- "calendar.md"
- "cryptocurrency.md"
- "data-redaction.md"
Binary file added theme/assets/img/ai-chat/kobold.png
2 changes: 2 additions & 0 deletions theme/assets/img/ai-chat/llamafile.svg
2 changes: 2 additions & 0 deletions theme/assets/img/ai-chat/ollama.svg
Binary file added theme/assets/img/cover/ai-chatbots.webp