
Architecture

Michael Haag edited this page Jul 23, 2024 · 1 revision

ShellSweepX Architecture

ShellSweepX is a web application designed for detecting and analyzing potential webshells. This document outlines the high-level architecture of the system.

System Components

  1. Web Application (FastAPI): The core of ShellSweepX, handling HTTP requests and WebSocket connections.
  2. Database (SQLite): Stores findings, agent information, and other relevant data.
  3. Machine Learning Model: Used for initial classification of files as potential webshells.
  4. YARA Rules Engine: Performs pattern matching on files using custom YARA rules.
  5. AI Integration: Uses GPT or Claude for in-depth analysis of potential webshells.
  6. Agent System: Allows remote scanning and reporting from distributed agents.

Architecture Diagram

Key Database Interactions

  1. Findings Table:

    • Stores information about analyzed files
    • Fields: id, sha256, file_name, result, file_size, content, feedback, analysis, entropy, std_dev, vt_score, yara_matches, created_at, submitted_timestamp, last_analyzed_timestamp, submitting_agents
  2. Agents Table:

    • Tracks information about remote agents
    • Fields: id, agent_id, computer_name, last_checkin
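
The two tables above could be created with Python's built-in `sqlite3` module roughly as follows. This is a minimal sketch: the column names come from the field lists, but the SQL types, the `UNIQUE` constraints, and the helpers `init_db` and `record_checkin` are assumptions made for illustration, not the project's actual schema or code.

```python
import sqlite3

# Column names follow the field lists above; types and constraints are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sha256 TEXT UNIQUE,
    file_name TEXT,
    result TEXT,
    file_size INTEGER,
    content TEXT,
    feedback TEXT,
    analysis TEXT,
    entropy REAL,
    std_dev REAL,
    vt_score TEXT,
    yara_matches TEXT,
    created_at TEXT,
    submitted_timestamp TEXT,
    last_analyzed_timestamp TEXT,
    submitting_agents TEXT
);
CREATE TABLE IF NOT EXISTS agents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_id TEXT UNIQUE,
    computer_name TEXT,
    last_checkin TEXT
);
"""


def init_db(path=":memory:"):
    """Open a database and create both tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_checkin(conn, agent_id, computer_name, when):
    """Refresh an agent's last_checkin, inserting the agent on first contact."""
    cur = conn.execute(
        "UPDATE agents SET computer_name = ?, last_checkin = ? WHERE agent_id = ?",
        (computer_name, when, agent_id),
    )
    if cur.rowcount == 0:  # no existing row was updated, so this agent is new
        conn.execute(
            "INSERT INTO agents (agent_id, computer_name, last_checkin) VALUES (?, ?, ?)",
            (agent_id, computer_name, when),
        )
    conn.commit()
```

Calling `record_checkin` twice with the same `agent_id` leaves a single row whose `last_checkin` holds the most recent timestamp.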

Main Functionalities

  1. File Upload and Analysis:

    • Files are uploaded and analyzed using the ML model
    • Results are stored in the findings table
    • YARA rules are applied if enabled
  2. Agent Reporting:

    • Remote agents scan files and report results
    • Findings are stored in the database
    • Agent check-ins are recorded
  3. AI-Powered Triage:

    • Suspicious files can be further analyzed using GPT or Claude
    • AI analysis results are stored in the findings table
  4. YARA Rule Management:

    • Custom YARA rules can be added, updated, or deleted
    • Rules are stored as files in the YARA_RULES_DIR
  5. Dashboard and Reporting:

    • Provides an overview of recent detections and statistics
    • Generates charts for trend analysis and webshell type distribution
  6. Real-time Updates:

    • Uses WebSockets to broadcast updates to connected clients
  7. Settings Management:

    • Allows configuration of API keys, YARA settings, and AI prompts
    • Settings are encrypted and stored in a file
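
For the File Upload and Analysis item, the `sha256`, `file_size`, `entropy`, and `std_dev` fields of the findings table suggest that per-file statistics are computed at analysis time. The sketch below shows one plausible way to derive them. It assumes `entropy` means the Shannon entropy of the byte distribution and `std_dev` means the standard deviation of byte values; the source does not define either, and `file_features` is a hypothetical helper name.

```python
import hashlib
import math
from collections import Counter


def file_features(data: bytes) -> dict:
    """Compute a hash and simple byte statistics for one file (assumed definitions)."""
    n = len(data)
    counts = Counter(data)  # maps byte value (0-255) to its number of occurrences
    entropy = 0.0
    std_dev = 0.0
    if n:
        # Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes.
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        # Population standard deviation of the byte values.
        mean = sum(data) / n
        std_dev = math.sqrt(sum(c * (b - mean) ** 2 for b, c in counts.items()) / n)
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "file_size": n,
        "entropy": entropy,
        "std_dev": std_dev,
    }
```

High entropy can hint at packed or encoded payloads, which is one reason such a value might be stored next to the classifier result.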

Data Flow

  1. Files are uploaded directly or reported by remote agents
  2. The ML model performs initial classification
  3. YARA rules are applied if enabled
  4. Results are stored in the database
  5. Suspicious files can be triaged with AI for deeper analysis
  6. Dashboard and reports are generated from the stored data
  7. Real-time updates are broadcast to connected clients
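
The steps above can be sketched as a single function that wires the stages together. Everything here is illustrative: `Finding`, `process_file`, the `"webshell"` result label, and the callable parameters are hypothetical stand-ins for the ML model, the YARA engine, the database, the AI provider, and the WebSocket broadcaster, not the project's actual interfaces.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Finding:
    sha256: str
    file_name: str
    result: str
    yara_matches: List[str] = field(default_factory=list)
    analysis: Optional[str] = None


def process_file(
    file_name: str,
    data: bytes,
    classify: Callable[[bytes], str],
    store: Callable[[Finding], None],
    broadcast: Callable[[dict], None],
    yara_scan: Optional[Callable[[bytes], List[str]]] = None,
    ai_triage: Optional[Callable[[bytes], str]] = None,
) -> Finding:
    """Run one file through the data flow (stub callables stand in for each stage)."""
    result = classify(data)                          # step 2: initial ML classification
    matches = yara_scan(data) if yara_scan else []   # step 3: YARA only if enabled
    finding = Finding(hashlib.sha256(data).hexdigest(), file_name, result, matches)
    store(finding)                                   # step 4: persist the result
    if ai_triage and result == "webshell":           # step 5: optional deeper analysis
        finding.analysis = ai_triage(data)
        store(finding)                               # save the AI analysis as well
    broadcast({"sha256": finding.sha256, "result": finding.result})  # step 7
    return finding


# Usage with in-memory stubs in place of the real components.
db = {}
events = []
finding = process_file(
    "x.php",
    b"<?php eval($_POST['c']); ?>",
    classify=lambda d: "webshell" if b"eval(" in d else "benign",
    store=lambda f: db.__setitem__(f.sha256, f),
    broadcast=events.append,
    yara_scan=lambda d: ["eval_post"],
    ai_triage=lambda d: "likely webshell",
)
```

Passing the stages in as callables keeps the flow testable without a model, a rules directory, or a network connection.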

This architecture lets ShellSweepX classify files, store the results, and push real-time updates to users, while also supporting distributed scanning through the agent system.