- Draft implementation with CGO llama.cpp backend
- Simple REST API for text generation
- Inference on Apple Silicon GPUs with the Metal framework
- Parallel inference on both CPU and GPU
- Support both AMD64 and ARM64 platforms
- CUDA support and fast inference with Nvidia cards
- Retain dialog history via a Session ID parameter
- Support the modern GGUF V3 model format
- Inference for the most popular LLM architectures
- Janus Sampling for better non-English text generation
- Rebrand project: LLaMAZoo => Large Model Collider
- Is it the 30th of November, 2023? First birthday of ChatGPT! Celebrate ...
- ... then release Collider V1 after half a year of honing it :)
- Full LLaMA v2 support
- Freeze JSON / YAML config format for Native API
- Rebrand project again :) Collider => Booster
- Complete LLaMA v3 support
- OpenAI-compatible Chat Completion API endpoints (see the request sketch after this list)
- Ollama-compatible endpoints
- Interactive mode for chatting from the command line
- Update Janus Sampling for LLaMA-3
- Broader integration with the Ollama ecosystem
- Smarter context shrinking when the chat history reaches the context limit
- Embedded web UI with no external dependencies
- Native Windows support
- Prebuilt binaries for all platforms
- Support inference for LLaVA multi-modal models
- Better code test coverage
- Perplexity computation, useful for benchmarking (see the formula after this list)
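Since the Chat Completion endpoint follows the OpenAI wire format, any OpenAI-style client should work against it. Below is a minimal Go sketch of such a request; the base URL `http://localhost:8080`, the `/v1/chat/completions` path, and the model name `default` are assumptions taken from the OpenAI convention rather than from this project's config, so adjust them to your setup. The Ollama-compatible endpoints accept Ollama's own request shape in the same way.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Minimal OpenAI-style chat completion request shape.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

func main() {
	// Assumption: the server listens on localhost:8080 and exposes the
	// standard OpenAI path -- change both to match your config.
	url := "http://localhost:8080/v1/chat/completions"

	payload, err := json.Marshal(chatRequest{
		Model: "default", // placeholder model name
		Messages: []chatMessage{
			{Role: "user", Content: "Hello! What can you do?"},
		},
	})
	if err != nil {
		panic(err)
	}

	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Print the raw JSON response (an OpenAI-style "choices" array).
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```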
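On perplexity: presumably this is the standard token-level measure (as computed by llama.cpp's `perplexity` tool), where lower values mean the model predicts the evaluation text better:

$$
\mathrm{PPL}(x_{1:N}) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\ln p\left(x_i \mid x_{<i}\right)\right)
$$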