Backend server component for Cogitatio Virtualis, providing vector search and document processing capabilities.
🟢 Vector Store: Complete
- FAISS integration
- SQLite metadata storage
- Safe index management
- Backup system
- Vector search
🟢 Document Processing: Complete
- File watching
- Markdown parsing
- Vector generation
- Metadata extraction
- Type validation
🟡 API Layer: In Progress
- FastAPI routes ✓
- Document endpoints ✓
- Search implementation ✓
- Response optimization ⚠️
🔴 Testing: Not Started
- No testing infrastructure currently implemented
Python >= 3.8
pip >= 20.0
# Install package in editable mode with development dependencies
pip install -e ".[dev]"
Start both API server and document watcher:
python -m cogitatio.scripts.start_server
Start API server only:
python -m cogitatio.scripts.start_server --api-only
Start document processor watcher only:
python -m cogitatio.scripts.start_server --processor-only --watch-only
Copy .env.example to .env and configure:
COGITATIO_ENV=development
VOYAGE_MODEL=voyage-3
VOYAGE_API_KEY=your_key_here
DATA_DIR=./data
DOCUMENTS_DIR=./documents
COGITATIO_LOG_PATH=./logs
HOST=127.0.0.1
PORT=8000
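The variables above could be read at startup roughly as follows. This is a minimal sketch, not the project's actual config module: the `Settings` class and `load_settings` function are hypothetical names, the defaults mirror the example values, and treating `VOYAGE_API_KEY` as required is an assumption. It also assumes the variables are already present in the process environment; it does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    env: str
    voyage_model: str
    voyage_api_key: str
    data_dir: Path
    documents_dir: Path
    log_path: Path
    host: str
    port: int


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to the example values."""
    env = os.environ if environ is None else environ
    api_key = env.get("VOYAGE_API_KEY", "")
    if not api_key:
        # Assumption: embedding generation cannot work without a key, so fail early.
        raise ValueError("VOYAGE_API_KEY must be set")
    return Settings(
        env=env.get("COGITATIO_ENV", "development"),
        voyage_model=env.get("VOYAGE_MODEL", "voyage-3"),
        voyage_api_key=api_key,
        data_dir=Path(env.get("DATA_DIR", "./data")),
        documents_dir=Path(env.get("DOCUMENTS_DIR", "./documents")),
        log_path=Path(env.get("COGITATIO_LOG_PATH", "./logs")),
        host=env.get("HOST", "127.0.0.1"),
        port=int(env.get("PORT", "8000")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise without touching the real environment.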
cogitatio-server/
├── pyproject.toml
├── requirements.txt
├── scripts/
│   ├── start_server.py
│   └── db_tools/
│       ├── db_explorer.py
│       └── vector_visualizer.py
└── cogitatio/
    ├── api/
    │   └── routes.py
    ├── document_processor/
    │   ├── config.py
    │   ├── document_store.py
    │   ├── processor.py
    │   ├── monitor.py
    │   └── vector_manager.py
    ├── types/
    │   └── schemas.py
    └── utils/
        └── logging.py
- Real-time document monitoring
- Markdown and YAML frontmatter parsing
- Automatic vector embedding generation
- Document chunking and metadata extraction
- Type validation and schema enforcement
- FAISS vector index management
- SQLite metadata database
- Atomic write operations
- Automatic backup system
- Safe index updates
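One common way to get atomic writes and safe index updates is to write to a temporary file in the same directory and then rename it over the target. The sketch below illustrates that general pattern only; `atomic_write_bytes` is a hypothetical helper, and the project's own vector store may implement this differently.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, data: bytes) -> None:
    """Write data so that readers see either the old file or the new one, never a partial file."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem as the target for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, target)  # atomically swap the new file into place
    except BaseException:
        # Remove the temp file if anything failed before the swap completed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies mid-write, the previous index file is left untouched, which is what makes recovery from a backup or the last good index possible.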
GET /health - Health check
GET /stats - Database statistics
GET /documents/{doc_id} - Get document by ID
POST /search - Vector search
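A client could call these endpoints with nothing but the standard library. The following is a sketch under assumptions: the base URL uses the HOST and PORT from the example configuration, and the request body fields `query` and `top_k` are hypothetical, since the actual search schema lives in the project's schemas and is not shown here.

```python
import json
import urllib.request

# HOST and PORT taken from the example configuration.
BASE_URL = "http://127.0.0.1:8000"


def build_search_request(query: str, top_k: int = 5) -> urllib.request.Request:
    """Prepare a POST /search request without sending it."""
    # ASSUMPTION: "query" and "top_k" are illustrative field names, not the confirmed schema.
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_json(path: str) -> dict:
    """Fetch a GET endpoint such as /health or /stats and decode the JSON response."""
    with urllib.request.urlopen(f"{BASE_URL}{path}", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `get_json("/health")` checks availability and `urllib.request.urlopen(build_search_request("vector databases"))` submits a search.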
- Similarity search
- Semantic search
- HyDE (Hypothetical Document Embeddings)
- Metadata filtering
- Document reconstruction
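To make similarity search with metadata filtering concrete, here is a brute-force stand-in in pure Python. It is not how the server does it (that goes through FAISS and SQLite); the function names and the `(doc_id, doc_type, vector)` tuple layout are assumptions chosen for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical record layout: (doc_id, doc_type, embedding vector)
Item = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query: Sequence[float],
    items: List[Item],
    doc_type: Optional[str] = None,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, optionally restricted to one document type."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, item_type, vector in items
        if doc_type is None or item_type == doc_type  # metadata filter
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A vector index replaces the linear scan with a much faster lookup, but the ranking idea stays the same: filter by metadata, score by vector similarity, keep the best matches.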
View and analyze the vector database:
python -m cogitatio.scripts.db_tools.db_explorer --data-dir ./data
Commands:
stats - Show database statistics
doc <doc_id> - Search by document ID
similar <vector_id> - Find similar vectors
type <doc_type> - Search by document type
Visualize the vector space:
python -m cogitatio.scripts.db_tools.vector_visualizer
Features:
- 2D/3D visualization
- Interactive clustering
- Color coding by document type
- Real-time updates
- Dimension reduction view
- Automatic index recovery
- Safe write operations
- Backup management
- Structured logging
- Operation tracking
- Batch vector processing
- Connection pooling
- Query optimization
- Efficient chunking
- Atomic operations
EXPERIENCE # Professional experience
EDUCATION # Educational background
PROJECT # Project documentation
OTHER # Additional document types
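The four document types map naturally onto a Python enum. This sketch assumes lowercase string values and a fallback to `OTHER` for unknown input; both choices, along with the `parse_document_type` helper, are illustrative and not taken from the project's schemas.

```python
from enum import Enum


class DocumentType(str, Enum):
    EXPERIENCE = "experience"  # Professional experience
    EDUCATION = "education"    # Educational background
    PROJECT = "project"        # Project documentation
    OTHER = "other"            # Additional document types


def parse_document_type(raw: str) -> DocumentType:
    """Map a frontmatter value to a DocumentType, ignoring case and surrounding whitespace."""
    try:
        return DocumentType[raw.strip().upper()]
    except KeyError:
        # Assumption: unknown types are bucketed as OTHER rather than rejected.
        return DocumentType.OTHER
```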
VECTOR_DIMENSION = 1024 # Embedding dimension
BATCH_SIZE = 100 # Vectors per batch
MAX_TOKENS = 32000 # Context length
DEBOUNCE_SECONDS = 1.0 # File change debounce
IGNORED_PATHS = { # Ignored patterns
'*/templates/*',
'*/.git/*',
'*/node_modules/*'
}
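Two of these settings are easy to illustrate: `IGNORED_PATHS` as glob patterns and `BATCH_SIZE` as a chunk size for embedding work. The sketch below assumes shell-style matching via the standard `fnmatch` module, which may differ from what the file monitor actually uses; `is_ignored` and `batched` are hypothetical helpers.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

BATCH_SIZE = 100
IGNORED_PATHS = {"*/templates/*", "*/.git/*", "*/node_modules/*"}


def is_ignored(path: str) -> bool:
    """True if the path matches any ignored pattern (shell-style wildcards)."""
    normalized = path.replace("\\", "/")  # treat Windows separators like POSIX ones
    return any(fnmatchcase(normalized, pattern) for pattern in IGNORED_PATHS)


def batched(items: Iterable[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter batch
```

Note that a pattern such as `*/templates/*` only matches when `templates` has a parent directory in the path, so a bare `templates/cv.md` would not be ignored under this matching rule.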
The backend architecture and vector implementation are part of the Cogitatio Virtualis project.
TODO...