Node0/crystallizer

Crystallizer

A Map -> Reduce powerhouse, disguised as an insight summarization tool.

Crystallizer is a programmable, LLM-powered, general-purpose data traversal and transformation tool.

Its default use case is insight extraction and cohesion across N parts of long documents (think books).

However, it can be programmed to perform a wide range of open-ended tasks, owing to its templated system and task prompt design.
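
To make the Map -> Reduce idea concrete, here is a minimal sketch, not Crystallizer's actual code. The names `map_reduce`, `first_sentence`, and `join_insights` are hypothetical, and the two helper functions are plain-Python stand-ins for what would be LLM calls in practice.

```python
from typing import Callable, List


def map_reduce(
    parts: List[str],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
) -> str:
    """Map: extract one insight per part. Reduce: merge the insights."""
    insights = [map_fn(part) for part in parts]
    return reduce_fn(insights)


# Hypothetical stand-in for a per-part LLM call: keep the first sentence.
def first_sentence(part: str) -> str:
    return part.split(".")[0].strip() + "."


# Hypothetical stand-in for the merging LLM call: join the insights.
def join_insights(insights: List[str]) -> str:
    return " ".join(insights)


parts = ["Cats sleep a lot. They also hunt.", "Dogs are loyal. They bark."]
result = map_reduce(parts, first_sentence, join_insights)
print(result)  # Cats sleep a lot. Dogs are loyal.
```

Swapping the two stand-ins for prompt-driven calls is what would turn this shape into a summarizer, or into any other per-part task.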

Crystallizer-Web

Installation

📋 Complete Installation Guide - Choose your preferred tool

Quick Start (pip)

Requirements: Python 3.11+

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Test installation
python crystallizer.py --help

Usage

python crystallizer.py \
  --system-prompt system_prompt.j2 \
  --haystack-path ./chat_logs \
  --provider ollama \
  --task-label gluon_design \
  --output-dir ./crystals

Configuration

Configure your LLM providers in config.json:

{
  "providers": {
    "ollama": {
      "host": "localhost",
      "port": 11434,
      "model": "qwen2.5-coder:32b",
      "context_length": 18000
    },
    "openai": {
      "api_key": "your-api-key",
      "model": "gpt-4o-mini",
      "context_length": 128000
    }
  }
}
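
As an illustration of how a file with this shape could be read, here is a small sketch using only the standard library. `load_provider` and `ollama_base_url` are hypothetical helpers, not functions from this repository, and the URL format is an assumption about how the `host` and `port` fields would be combined.

```python
import json

# Inline copy of the "ollama" entry from the config above, for a self-contained demo.
EXAMPLE_CONFIG = """
{
  "providers": {
    "ollama": {
      "host": "localhost",
      "port": 11434,
      "model": "qwen2.5-coder:32b",
      "context_length": 18000
    }
  }
}
"""


def load_provider(config_text: str, name: str) -> dict:
    """Return the settings for one provider, or raise KeyError if it is missing."""
    providers = json.loads(config_text).get("providers", {})
    if name not in providers:
        raise KeyError(f"unknown provider {name!r}; available: {sorted(providers)}")
    return providers[name]


def ollama_base_url(settings: dict) -> str:
    """Assumed convention: plain HTTP on the configured host and port."""
    return f"http://{settings['host']}:{settings['port']}"


settings = load_provider(EXAMPLE_CONFIG, "ollama")
print(settings["model"], ollama_base_url(settings))
```

Failing early on an unknown provider name keeps a typo in `--provider` from surfacing later as a confusing connection error.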

Features

  • Token-Aware Windowing: Automatically chunks large documents to fit LLM context limits
  • Multi-Provider Support: Works with Ollama (local) and OpenAI (cloud) backends
  • Template-Driven Prompts: Jinja2 templates for custom system prompts
  • Hierarchical Processing: 3-segment micro-windowing with merge strategies
  • Professional Logging: Semantic progress tracking with contextual semaphores
  • Batch Processing: Handle single files or entire directories
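
The token-aware windowing feature can be pictured with the rough sketch below. It is an assumption-laden illustration, not the tool's implementation: `count_tokens` and `window` are hypothetical names, tokens are approximated as whitespace-separated words, and `reserved` stands for whatever budget the system prompt and response would need.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Crude approximation: one token per whitespace-separated word.
    # A real tokenizer would give different counts.
    return len(text.split())


def window(text: str, context_length: int, reserved: int = 0) -> List[str]:
    """Split text into consecutive windows that each fit the token budget."""
    budget = context_length - reserved
    if budget <= 0:
        raise ValueError("reserved tokens leave no room for document text")
    words = text.split()
    return [" ".join(words[i:i + budget]) for i in range(0, len(words), budget)]


doc = " ".join(f"w{i}" for i in range(10))  # a 10-"token" document
windows = window(doc, context_length=6, reserved=2)  # budget of 4 per window
for w in windows:
    print(count_tokens(w), w)
```

With a budget of 4, the 10-word document yields three windows of 4, 4, and 2 words. Each window could then be processed independently and the results merged afterwards.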

License

GNU GPL v3
