V0 Roadmap - Fall'23

  • Draft implementation with CGO llama.cpp backend
  • Simple REST API to allow text generation
  • Inference with Apple Silicon GPU using Metal framework
  • Parallel inference on both CPU and GPU
  • Support both AMD64 and ARM64 platforms
  • CUDA support and fast inference with Nvidia cards
  • Retain dialog history via a Session ID parameter
  • Support modern GGUF V3 model format
  • Inference for most popular LLM architectures
  • Janus Sampling for better non-English text generation

V1 Roadmap - Winter'23

  • Rebrand project: LLaMAZoo => Large Model Collider
  • Is it 2023, 30th of November? First birthday of ChatGPT! Celebrate ...
  • ... then release Collider V1 after half a year of honing it :)

V2 Roadmap - Spring'24

  • Full LLaMA v2 support
  • Freeze JSON / YAML config format for Native API

V3 Roadmap - Summer'24

  • Rebrand project again :) Collider => Booster
  • Complete LLaMA v3 support
  • OpenAI API Chat Completion compatible endpoints
  • Ollama compatible endpoints
  • Interactive mode for chatting from command line
  • Update Janus Sampling for LLaMA-3
  • Broader integration with Ollama ecosystem
  • Smarter context shrinking when a chat reaches the model's context limit
  • Embedded web UI with no external dependencies
  • Native Windows support
  • Prebuilt binaries for all platforms
  • Support inference for LLaVA multi-modal models
  • Better code test coverage
  • Perplexity computation useful for benchmarking