A powerful multi-agent platform with LLM integration, designed for building sophisticated AI-powered applications with dynamic agent management and real-time inference capabilities.
- Quick Start
- Features
- Architecture
- Installation
- Configuration
- API Reference
- Testing
- Integration
- Performance
- Contributing
- License
Get the Kolosal Agent System up and running in minutes:
- C++20 compatible compiler (GCC 9+, Clang 10+, or Visual Studio 2019+)
- CMake 3.14 or higher
- Git with submodule support
- 4GB RAM minimum (8GB+ recommended)
# Clone the repository
git clone --recursive https://github.com/KolosalAI/kolosal-agent.git
cd kolosal-agent
# Build the application
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug
cmake --build . --config Debug
# Run the system
./kolosal-agent # Linux/macOS (uses config.yaml by default)
.\Debug\kolosal-agent.exe # Windows (uses config.yaml by default)
# Run in Docker with configuration files mounted from ./config
docker run -d \
  --name kolosal-agent \
  -p 8081:8081 \
  -v "$(pwd)/config:/app/config" \
  kolosal-agent:latest
# Test the API
curl http://localhost:8081/status
# List default agents
curl http://localhost:8081/agents
# Simple execute (recommended) - runs all tools automatically
curl -X POST http://localhost:8081/agent/execute \
-H "Content-Type: application/json" \
-d '{"query": "What is artificial intelligence?", "context": "Explain for beginners"}'
# Chat with a specific agent
curl -X POST http://localhost:8081/agents/Assistant/execute \
-H "Content-Type: application/json" \
-d '{"function": "chat", "params": {"message": "Hello!", "model": "qwen2.5-0.5b"}}'
# Execute a workflow
curl -X POST http://localhost:8081/workflows/simple_research/execute \
-H "Content-Type: application/json" \
-d '{"input_data": {"query": "AI trends"}}'
# List available workflows
curl http://localhost:8081/workflows
- Dynamic Agent Creation - Create specialized agents with custom capabilities
- Lifecycle Management - Start, stop, and manage multiple agents simultaneously
- Built-in Functions - Chat, analysis, research, and custom function execution
- System Prompts - Configure agents with specialized instructions
- Multiple Execution Patterns - Sequential, parallel, conditional, loop, and pipeline workflows
- Workflow Templates - Pre-built workflows for research, analysis, decision-making, and data processing
- Custom Workflows - Design and register your own multi-step workflows via YAML or API
- Agent-LLM Pairing - Configure specific LLM models for each agent in workflows
- Execution Control - Pause, resume, and cancel long-running workflows in real-time
- Pipeline Processing - Chain agent outputs as inputs to subsequent steps
- Progress Monitoring - Real-time execution tracking and detailed progress reporting
- Conditional Logic - Dynamic step execution based on previous results
- HTTP REST API - OpenAPI-compatible endpoints for all operations
- Real-time Communication - Instant agent responses and status updates
- Configuration Management - Hot-reloadable YAML configurations
- Health Monitoring - System status and agent health tracking
- Kolosal Server Integration - Built-in integration with high-performance inference server
- Model Interface - Flexible model communication and parameter handling
- Retrieval Augmented Generation - Document retrieval and web search capabilities
- Multi-Model Support - Support for various LLM models and embedding models
- Simple Configuration - YAML-based configuration files
- Cross-Platform - Windows, Linux, and macOS support
- Comprehensive Testing - Unit, integration, and performance tests
- Extensive Documentation - Complete API documentation and examples
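As a rough illustration of the pipeline pattern listed above, where each step's output becomes the input of the next step, here is a minimal Python sketch. The `run_pipeline` helper and the `research` and `summarize` step functions are hypothetical stand-ins and not part of the Kolosal API; real workflows are defined in YAML or through the REST endpoints.

```python
from typing import Any, Callable, Iterable

# Hypothetical step type: takes the previous output, returns the next input.
Step = Callable[[Any], Any]


def run_pipeline(steps: Iterable[Step], initial_input: Any) -> Any:
    """Feed the output of each step into the next one, in order."""
    value = initial_input
    for step in steps:
        value = step(value)
    return value


# Stand-ins for agent functions (illustrative only).
def research(query: str) -> dict:
    return {"query": query, "notes": [f"finding about {query}"]}


def summarize(research_result: dict) -> str:
    count = len(research_result["notes"])
    return f"{count} note(s) on {research_result['query']}"


if __name__ == "__main__":
    print(run_pipeline([research, summarize], "AI trends"))
```

A sequential workflow would run the same steps in order but without forwarding outputs; the pipeline differs only in that `value` is threaded through every step.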
The Kolosal Agent System features a layered architecture designed for scalability and maintainability:
┌─────────────────────────────────────────────────────────────┐
│                       REST API Layer                        │
│  ┌─────────────────┬─────────────────┬─────────────────┐    │
│  │   Agent APIs    │  Workflow APIs  │   System APIs   │    │
│  └─────────────────┴─────────────────┴─────────────────┘    │
├─────────────────────────────────────────────────────────────┤
│                Workflow Orchestration Layer                 │
│  ┌─────────────────┬─────────────────┬─────────────────┐    │
│  │Workflow Manager │Workflow Builder │ Template Engine │    │
│  │   & Executor    │ & Orchestrator  │ & Config Loader │    │
│  └─────────────────┴─────────────────┴─────────────────┘    │
├─────────────────────────────────────────────────────────────┤
│                     Multi-Agent System                      │
│  ┌─────────────────┬─────────────────┬─────────────────┐    │
│  │  Agent Manager  │ Message Router  │Function Registry│    │
│  └─────────────────┴─────────────────┴─────────────────┘    │
├─────────────────────────────────────────────────────────────┤
│                       Core Components                       │
│  ┌─────────────────┬─────────────────┬─────────────────┐    │
│  │   Agent Core    │ Config Manager  │   HTTP Server   │    │
│  └─────────────────┴─────────────────┴─────────────────┘    │
├─────────────────────────────────────────────────────────────┤
│                   Inference & Data Layer                    │
│  ┌─────────────────┬─────────────────┬─────────────────┐    │
│  │ Kolosal Server  │ Model Interface │Retrieval Manager│    │
│  └─────────────────┴─────────────────┴─────────────────┘    │
└─────────────────────────────────────────────────────────────┘
- Workflow System - Advanced workflow orchestration with multiple execution patterns
- Agent System - Manages agent lifecycle, creation, and function execution
- HTTP Server - REST API with concurrent request processing and workflow endpoints
- Kolosal Server Integration - Optional high-performance LLM inference
- Configuration Management - Hot-reloadable YAML configurations for agents and workflows
- Multiple Execution Types - Sequential, parallel, conditional, loop, and pipeline workflows
- Agent-LLM Pairing - Configurable agent-model mappings for optimal performance
- Built-in Templates - Pre-defined workflow templates for common patterns
- Dynamic Execution - Real-time workflow execution control (pause, resume, cancel)
- Progress Monitoring - Detailed execution tracking and progress reporting
- C++20 compatible compiler
- CMake 3.14+
- Git with submodule support
- 4GB+ RAM
Windows (PowerShell)
### Clone Repository
```powershell
# Clone the repository
git clone --recursive https://github.com/KolosalAI/kolosal-agent.git
cd kolosal-agent
# Create build directory
New-Item -ItemType Directory -Path "build" -Force
Set-Location "build"
# Configure and build
cmake .. -G "Visual Studio 17 2022" -A x64 -DCMAKE_BUILD_TYPE=Debug
cmake --build . --config Debug --parallel
# Run
.\Debug\kolosal-agent.exe
```
Linux (Ubuntu/Debian)
# Install dependencies
sudo apt update
sudo apt install build-essential cmake git libcurl4-openssl-dev
# Clone and build
git clone --recursive https://github.com/KolosalAI/kolosal-agent.git
cd kolosal-agent && mkdir build && cd build
# Configure and build
cmake .. -DCMAKE_BUILD_TYPE=Debug
make -j$(nproc)
# Run
./kolosal-agent
macOS
# Install dependencies
brew install cmake git curl yaml-cpp
# Clone and build
git clone --recursive https://github.com/KolosalAI/kolosal-agent.git
cd kolosal-agent && mkdir build && cd build
# Configure and build
cmake .. -DCMAKE_BUILD_TYPE=Debug
make -j$(sysctl -n hw.ncpu)
# Run
./kolosal-agent
# Standard build
cmake .. -DCMAKE_BUILD_TYPE=Debug
# With tests
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON
# With kolosal-server integration
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_KOLOSAL_SERVER=ON
# Check build outputs
ls build/Debug/ # Windows
ls build/ # Linux/macOS
# Test the application
curl http://localhost:8081/status
The system uses YAML configuration files with environment variable support for security and portability.
- Copy the environment template:
  cp .env.template .env
- Edit `.env` with your values:
  # API Keys (keep these secret!)
  KOLOSAL_API_KEY=your-api-key-here
  KOLOSAL_SEARCH_API_KEY=your-search-api-key
  KOLOSAL_QDRANT_API_KEY=your-qdrant-api-key
  # Paths (relative to project root)
  KOLOSAL_MODEL_PATH=./models
  # Windows
  KOLOSAL_ENGINE_PATH=./build/Debug
  # Linux/macOS
  # KOLOSAL_ENGINE_PATH=./build
  # Network settings
  KOLOSAL_HOST=127.0.0.1
  KOLOSAL_PORT=8081
- Load environment and start:
  # Linux/macOS (set -a exports the sourced variables to the child process)
  set -a && source .env && set +a && ./kolosal-agent
  # Windows PowerShell (lines starting with # are skipped)
  Get-Content .env | ForEach-Object { if ($_ -match '^([^#=]+)=(.*)$') { [Environment]::SetEnvironmentVariable($matches[1], $matches[2], 'Process') } }
  .\Debug\kolosal-agent.exe
The system supports three main configuration approaches:
- Environment Variables (Recommended) - See Environment Variables Guide
- Direct YAML Configuration - Traditional approach
- Hybrid - YAML with environment variable substitution
system:
  name: "Kolosal Agent System"
  version: "1.0.0"
  host: ${KOLOSAL_HOST:-127.0.0.1}
  port: ${KOLOSAL_PORT:-8081}
  log_level: ${KOLOSAL_LOG_LEVEL:-info}
# API key for authentication (when enabled)
security:
  api_key: ${KOLOSAL_API_KEY:-}
agents:
  - name: "Assistant"
    capabilities: ["chat", "analysis", "reasoning"]
    auto_start: true
  - name: "RetrievalAgent"
    capabilities: ["retrieval", "document_management", "semantic_search"]
    auto_start: true
functions:
  chat:
    description: "Interactive chat functionality"
    parameters:
      - name: "message"
        type: "string"
        required: true
      - name: "model"
        type: "string"
        required: true
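The `${VAR:-default}` placeholders in the configuration above follow shell-style default syntax. The Python sketch below shows how such placeholders could be resolved; the `expand` function is a hypothetical illustration, and the assumption that the loader treats an empty variable like an unset one (as a POSIX shell does for `:-`) is not confirmed by this README.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}. This mirrors shell-style syntax and
# is an assumption about how the configuration loader behaves.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand(text, env=None):
    """Replace ${VAR:-default} placeholders using env (defaults to os.environ)."""
    env = os.environ if env is None else env

    def _replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        # Shell semantics for ":-": fall back when the variable is unset OR empty.
        if value:
            return value
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_replace, text)


if __name__ == "__main__":
    print(expand("host: ${KOLOSAL_HOST:-127.0.0.1}", env={}))
    print(expand("port: ${KOLOSAL_PORT:-8081}", env={"KOLOSAL_PORT": "9000"}))
```

With an empty environment the first line resolves to the default `127.0.0.1`; in the second, the supplied `KOLOSAL_PORT` value overrides the default `8081`.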
Optional - for LLM integration
server:
  port: ${KOLOSAL_PORT:-8081}
  host: ${KOLOSAL_HOST:-127.0.0.1}
  idle_timeout: 300
# Models with environment variable paths
models:
  - id: qwen2.5-0.5b
    path: ${KOLOSAL_MODEL_PATH:-./models}/qwen2.5-0.5b-instruct-q4_k_m.gguf
    type: llm
    load_immediately: true
  - id: all-MiniLM-L6-v2-bf16-q4_k
    path: ${KOLOSAL_MODEL_PATH:-./models}/all-MiniLM-L6-v2-bf16-q4_k.gguf
    type: embedding
    load_immediately: true
# Database configuration with API key support
database:
  vector_database: ${KOLOSAL_VECTOR_DB:-qdrant}
  qdrant:
    host: ${KOLOSAL_QDRANT_HOST:-localhost}
    port: ${KOLOSAL_QDRANT_PORT:-6333}
    api_key: ${KOLOSAL_QDRANT_API_KEY:-}
  faiss:
    index_path: ${KOLOSAL_FAISS_INDEX_PATH:-./data/faiss_index}
# Search with API key externalization
search:
  enabled: true
  searxng_url: ${KOLOSAL_SEARXNG_URL:-https://searx.stream/}
  api_key: ${KOLOSAL_SEARCH_API_KEY:-}
# Inference engines with relative paths
inference_engines:
  - name: llama-cpu
    library_path: ${KOLOSAL_ENGINE_PATH:-./build/Debug}/llama-cpu.dll
- API Key Externalization: All API keys use environment variables
- Relative Paths: No hardcoded absolute paths
- Platform Independence: Works across Windows, Linux, and macOS
- Secret Management: Supports integration with secret management systems
- Environment Documentation: Comprehensive variable documentation
The default configuration files are designed for development use and do not include production security settings. Configure the following per deployment:
# Set API keys via environment variables (production)
export KOLOSAL_API_KEY="your-secure-api-key"
export KOLOSAL_SEARCH_API_KEY="your-search-api-key"
export KOLOSAL_QDRANT_API_KEY="your-qdrant-api-key"
# In config.yaml - configure allowed origins per deployment
auth:
  cors:
    enabled: true
    allowed_origins:
      - "https://yourdomain.com"
      - "https://app.yourdomain.com"
# Configure per deployment in config.yaml
auth:
  rate_limit:
    enabled: true
    max_requests: 100 # Adjust based on needs
    window_size: 60
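To make the `max_requests` and `window_size` settings above concrete, the following Python sketch models a fixed-window limiter. It is an illustration only: the README does not say whether the server uses a fixed or sliding window, and treating `window_size` as seconds is an assumption. The `FixedWindowLimiter` class is hypothetical.

```python
import time


class FixedWindowLimiter:
    """Allow at most max_requests per window_size seconds (fixed window)."""

    def __init__(self, max_requests=100, window_size=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_size = window_size  # assumed to be seconds
        self._clock = clock             # injectable so the sketch is testable
        self._window_start = clock()
        self._count = 0

    def allow(self):
        """Return True if the request fits in the current window."""
        now = self._clock()
        # Start a fresh window once the current one has elapsed.
        if now - self._window_start >= self.window_size:
            self._window_start = now
            self._count = 0
        if self._count < self.max_requests:
            self._count += 1
            return True
        return False


if __name__ == "__main__":
    limiter = FixedWindowLimiter(max_requests=100, window_size=60)
    accepted = sum(limiter.allow() for _ in range(150))
    print("accepted:", accepted)
```

Under these assumptions, a burst of 150 requests inside one window would see 100 accepted and 50 rejected, which is a useful way to reason about what value to pick for a given deployment.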
- Set unique API keys via environment variables
- Configure specific CORS origins (never use "*" in production)
- Enable authentication: `enabled: true`
- Set appropriate rate limits for your use case
- Use HTTPS in production environments
- Regularly rotate API keys
- Monitor and log authentication attempts
- Environment Variables Guide - Complete list of supported variables
- Configuration Migration - Guide for migrating existing configs
- Example Configuration - Full example with all options
If you have existing configuration files with hardcoded security settings, follow these steps to migrate:
- Remove hardcoded API keys and CORS origins from your YAML files
- Copy `.env.template` to `.env` and configure your specific values
- Use environment variables for all sensitive configuration
- Never commit `.env` files to version control
Example migration:
# OLD (insecure - hardcoded values)
security:
  api_key: "your-api-key-here"
  allowed_origins: ["*"]
# NEW (secure - environment variables)
security:
  api_key: ${KOLOSAL_API_KEY:-}
  allowed_origins: ["${KOLOSAL_CORS_ORIGIN_1:-}"]
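When migrating, it can help to scan existing YAML files for keys that still carry literal values. The Python sketch below is one possible check, not a tool shipped with the project: `find_hardcoded_secrets` is a hypothetical helper, and limiting the scan to keys ending in `api_key` is an assumption made for illustration.

```python
import re

# Keys treated as sensitive here (anything ending in "api_key") are an
# illustrative assumption, not a list defined by the project.
_SENSITIVE_LINE = re.compile(r"^\s*(\w*api_key)\s*:\s*(.*?)\s*$")


def find_hardcoded_secrets(yaml_text):
    """Return (line_number, key) pairs whose value is a literal, not ${VAR}."""
    findings = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        match = _SENSITIVE_LINE.match(line)
        if not match:
            continue
        key, value = match.groups()
        value = value.strip("\"'")
        # Empty values and ${...} placeholders are considered safe.
        if value and not value.startswith("${"):
            findings.append((number, key))
    return findings


if __name__ == "__main__":
    old_config = 'security:\n  api_key: "your-api-key-here"\n'
    new_config = "security:\n  api_key: ${KOLOSAL_API_KEY:-}\n"
    print(find_hardcoded_secrets(old_config))
    print(find_hardcoded_secrets(new_config))
```

The check is line-based and deliberately simple; a real audit would parse the YAML and also cover CORS origins and other sensitive settings.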
GET /status
Returns system health and agent statistics.
GET /agents
POST /agents
Content-Type: application/json
{
"name": "CustomAgent",
"capabilities": ["chat", "research"]
}
POST /agents/{agent_name}/execute
Content-Type: application/json
{
"function": "chat",
"params": {
"message": "Hello!",
"model": "qwen2.5-0.5b"
}
}
GET /workflows
POST /workflows
Content-Type: application/json
{
"id": "custom_workflow",
"name": "Custom Workflow",
"type": 0,
"steps": [
{
"id": "step1",
"agent_name": "Assistant",
"function_name": "chat",
"parameters": {"message": "Hello", "model": "qwen2.5-0.5b"}
}
]
}
POST /workflows/{id}/execute
Content-Type: application/json
{
"input_data": {"query": "AI research trends"}
}
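Before sending a workflow definition to `POST /workflows`, a client can sanity-check the payload locally. The Python sketch below checks only the field names that appear in the example request above; treating all of them as required is an assumption, and `validate_workflow` is a hypothetical helper rather than part of the server API.

```python
import json

# Field names come from the example request; treating all of them as
# required is an assumption, not documented server behaviour.
WORKFLOW_FIELDS = ("id", "name", "type", "steps")
STEP_FIELDS = ("id", "agent_name", "function_name", "parameters")


def validate_workflow(definition):
    """Return a list of human-readable problems; empty means it looks OK."""
    problems = [f"missing field: {f}" for f in WORKFLOW_FIELDS if f not in definition]
    seen_ids = set()
    for index, step in enumerate(definition.get("steps", [])):
        for field in STEP_FIELDS:
            if field not in step:
                problems.append(f"step {index}: missing field: {field}")
        step_id = step.get("id")
        if step_id is not None:
            if step_id in seen_ids:
                problems.append(f"step {index}: duplicate id: {step_id}")
            seen_ids.add(step_id)
    return problems


if __name__ == "__main__":
    workflow = {
        "id": "custom_workflow",
        "name": "Custom Workflow",
        "type": 0,
        "steps": [{
            "id": "step1",
            "agent_name": "Assistant",
            "function_name": "chat",
            "parameters": {"message": "Hello", "model": "qwen2.5-0.5b"},
        }],
    }
    print(validate_workflow(workflow) or "looks OK")
    print(json.dumps(workflow))
```

The server remains the source of truth for validation; a local check like this only catches obvious omissions earlier.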
# System status
curl http://localhost:8081/status
# List agents
curl http://localhost:8081/agents
# Simple execute with automatic tool execution (recommended)
curl -X POST http://localhost:8081/agent/execute \
-H "Content-Type: application/json" \
-d '{"query": "What are the latest AI trends?", "context": "Focus on 2025 developments"}'
# Chat with assistant
curl -X POST http://localhost:8081/agents/Assistant/execute \
-H "Content-Type: application/json" \
-d '{"function": "chat", "params": {"message": "Hello!", "model": "qwen2.5-0.5b"}}'
# Analyze text
curl -X POST http://localhost:8081/agents/Analyzer/execute \
-H "Content-Type: application/json" \
-d '{"function": "analyze", "params": {"text": "Sample text"}}'
# Document retrieval
curl -X POST http://localhost:8081/agents/RetrievalAgent/execute \
-H "Content-Type: application/json" \
-d '{"function": "retrieve_and_answer", "params": {"question": "What is AI?"}}'
# List available workflows
curl http://localhost:8081/workflows
# Execute a workflow
curl -X POST http://localhost:8081/workflows/simple_research/execute \
-H "Content-Type: application/json" \
-d '{"input_data": {"query": "machine learning trends"}}'
The system includes comprehensive testing support with automated test runners and multiple test categories.
# Windows PowerShell
.\tests\run_tests.ps1
# Linux/macOS
./tests/run_tests.sh
| Category | Description | Tests Included |
|---|---|---|
| `all` | Complete test suite | All available tests |
| `quick` | Fast unit tests | `minimal_demo`, `model_interface`, `config_manager`, `workflow_config` |
| `demo` | Demo functionality | `minimal_demo`, `simple_demo` |
| `workflow` | Workflow system | `workflow_config`, `workflow_manager`, `workflow_orchestrator` |
| `integration` | API & system integration | `agent_execution`, `http_server` |
| `retrieval` | Document retrieval | `retrieval_agent` |
| `stress` | Error & stress testing | `error_scenarios` |
Windows PowerShell
# Run all tests
.\tests\run_tests.ps1 -TestType all
# Run quick tests with verbose output
.\tests\run_tests.ps1 -TestType quick -VerboseOutput
# Run specific category
.\tests\run_tests.ps1 -TestType workflow
# Run with custom timeout
.\tests\run_tests.ps1 -TestType integration -TimeoutMinutes 5
# Run specific test
.\tests\run_tests.ps1 -TestType agent_execution
# Save results to file
.\tests\run_tests.ps1 -TestType all -OutputFile "test_results.json"
Available Options:
- `-TestType` - Test category or specific test name
- `-BuildFirst $false` - Skip building before testing
- `-VerboseOutput` - Enable detailed output
- `-OutputFile` - Save results to JSON file
- `-TimeoutMinutes` - Set timeout per test (default: 1)
Linux/macOS Bash
# Make executable (first time only)
chmod +x tests/run_tests.sh
# Run all tests
./tests/run_tests.sh
# Run quick tests with verbose output
./tests/run_tests.sh --test-type quick --verbose
# Run specific category
./tests/run_tests.sh -t workflow
# Run with custom timeout
./tests/run_tests.sh --test-type integration --timeout 5
# Run specific test
./tests/run_tests.sh -t agent_execution
# Save results to file
./tests/run_tests.sh -t all -o test_results.json
Available Options:
- `-t, --test-type` - Test category or specific test name
- `-b, --build-first false` - Skip building before testing
- `-v, --verbose` - Enable detailed output
- `-o, --output-file` - Save results to JSON file
- `--timeout` - Set timeout per test in minutes (default: 1)
- `-h, --help` - Show help message
# Build tests manually
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON
cmake --build . --config Debug
# Run tests via CMake
ctest --output-on-failure
# Run individual test executables
./test_agent_execution # Linux/macOS
.\Debug\test_agent_execution.exe # Windows
# Available test executables:
# - minimal_test_demo / minimal_test_demo.exe
# - test_agent_execution
# - test_config_manager
# - test_workflow_config
# - test_workflow_manager
# - test_workflow_orchestrator
# - test_http_server
# - test_model_interface
# - test_error_scenarios
# - test_retrieval_agent
Successful Test Output:
Running Kolosal Agent System Tests
Test Type: quick
Building tests...
Build completed successfully
Running tests...
minimal_demo PASSED (0.8s)
model_interface PASSED (1.2s)
config_manager PASSED (0.6s)
workflow_config PASSED (1.1s)
Results: 4/4 tests passed (100% success)
Total time: 3.7 seconds
Failed Test Output:
Running Kolosal Agent System Tests
Running tests...
minimal_demo PASSED (0.8s)
workflow_manager FAILED (2.3s)
└── Error: Workflow execution timeout
Results: 1/2 tests passed (50% success)
Some tests failed. Check logs for details.
# Quick verification after build
./tests/run_tests.sh -t demo
# Basic functionality check
curl http://localhost:8081/status
curl http://localhost:8081/agents
# Workflow system check
curl http://localhost:8081/workflows
| Issue | Solution |
|---|---|
| Build fails | Ensure C++20 compiler and CMake 3.14+ |
| Tests timeout | Increase timeout: `--timeout 10` |
| Missing dependencies | Run `git submodule update --init --recursive` |
| Permission denied | Run `chmod +x tests/run_tests.sh` |
For complete testing documentation, see `tests/README.md`.
#include <memory>
#include <string>
// nlohmann/json; adjust the include path to match your build setup
#include <nlohmann/json.hpp>
#include "agent_manager.hpp"
#include "agent_config.hpp"
#include "workflow_manager.hpp"
#include "workflow_types.hpp"

using json = nlohmann::json;

int main() {
    // Load configuration
    auto config_manager = std::make_shared<AgentConfigManager>();
    config_manager->load_config("agent.yaml");
    // Create agent manager
    auto agent_manager = std::make_shared<AgentManager>(config_manager);
    // Create workflow manager
    auto workflow_manager = std::make_shared<WorkflowManager>(agent_manager);
    workflow_manager->start();
    // Create workflow orchestrator
    auto workflow_orchestrator = std::make_shared<WorkflowOrchestrator>(workflow_manager);
    workflow_orchestrator->start();
    // Create and use agent
    std::string agent_id = agent_manager->create_agent("MyAgent", {"chat"});
    agent_manager->start_agent(agent_id);
    // Execute simple function (model id must match one defined in your configuration)
    json params = {{"message", "Hello!"}, {"model", "qwen2.5-0.5b"}};
    auto result = agent_manager->execute_function(agent_id, "chat", params);
    // Execute workflow
    json workflow_input = {{"query", "AI trends"}};
    std::string execution_id = workflow_orchestrator->execute_workflow("simple_research", workflow_input);
    // Monitor workflow execution
    auto execution_status = workflow_orchestrator->get_execution_status(execution_id);
    return 0;
}
import requests


class KolosalAgentClient:
    def __init__(self, base_url="http://localhost:8081"):
        # 8081 is the default port used throughout this README
        self.base_url = base_url

    def simple_execute(self, query, context=None, model=None, agent=None):
        """Execute query with automatic agent and tool selection (recommended)"""
        payload = {"query": query}
        if context:
            payload["context"] = context
        if model:
            payload["model"] = model
        if agent:
            payload["agent"] = agent
        response = requests.post(f"{self.base_url}/agent/execute", json=payload)
        return response.json()

    def create_agent(self, name, capabilities):
        response = requests.post(f"{self.base_url}/agents",
                                 json={"name": name, "capabilities": capabilities})
        return response.json()

    def execute_function(self, agent_name, function, params):
        response = requests.post(f"{self.base_url}/agents/{agent_name}/execute",
                                 json={"function": function, "params": params})
        return response.json()

    def list_workflows(self):
        response = requests.get(f"{self.base_url}/workflows")
        return response.json()

    def execute_workflow(self, workflow_id, input_data=None):
        payload = {}
        if input_data:
            payload["input_data"] = input_data
        response = requests.post(f"{self.base_url}/workflows/{workflow_id}/execute", json=payload)
        return response.json()


# Usage Examples
client = KolosalAgentClient()

# Simple execute with comprehensive tool execution (recommended)
result = client.simple_execute(
    query="What is machine learning?",
    context="Explain for beginners",
    model="qwen2.5-0.5b"
)
print("Tools executed:", len(result["tools_executed"]))
print("Success rate:", f"{result['summary']['successful']}/{result['summary']['total_tools']}")
print("LLM Response:", result["llm_response"]["response"][:100] + "...")

# Traditional agent function call
chat_result = client.execute_function("Assistant", "chat", {
    "message": "Hello from Python!",
    "model": "qwen2.5-0.5b"
})
print("Agent Response:", chat_result)

# List available workflows
workflows = client.list_workflows()
print("Available Workflows:", workflows)

# Execute a workflow
workflow_result = client.execute_workflow("simple_research", {
    "query": "machine learning trends 2025"
})
print("Workflow Execution ID:", workflow_result.get("execution_id"))
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y \
build-essential cmake git libcurl4-openssl-dev
COPY . /app
WORKDIR /app
RUN mkdir build && cd build && \
cmake .. -DCMAKE_BUILD_TYPE=Release && \
cmake --build . --config Release
# Copy configuration files
COPY agent.yaml workflow.yaml config.yaml ./build/
EXPOSE 8081
CMD ["./build/kolosal-agent"]
- Agent Creation: Sub-second initialization
- Memory Usage: ~0.9MB per agent
- Startup Time: <5 seconds with 100 agents
- Concurrent Requests: 1000+ requests/second
- Function Execution: Low-latency with concurrent processing
- Tune `max_concurrent_requests` and `worker_threads` in configuration
- Use local models for faster inference
- Enable response caching for repeated queries
- Set appropriate memory limits per agent
We welcome contributions! Please see our Developer Guide for details.
# Fork and clone
git clone --recursive https://github.com/your-username/kolosal-agent.git
cd kolosal-agent
# Build with tests
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON
cmake --build . --config Debug
# Run tests
ctest --output-on-failure
# Format code (optional)
clang-format -i src/**/*.cpp include/**/*.hpp
- Fork the repository
- Create a feature branch
- Make changes and add tests
- Run the test suite
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- nlohmann/json - JSON library for Modern C++
- yaml-cpp - YAML parsing library
- Kolosal Server - High-performance inference server integration
- Complete Documentation: `docs/` directory
- Issues & Support: GitHub Issues
- Examples & Demos: `workflows/` directory
- Build Scripts: Available via VS Code tasks or direct cmake commands