- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1
feat: telemetry and security hardening #56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…on guide This commit introduces a detailed PostHog instrumentation blueprint for the OpenChat monorepo, covering web, server, and extension components. It outlines baseline setup, event taxonomy with triggers and properties, common properties, dashboards and insights strategies, implementation notes by file, and next steps for full analytics coverage. Key additions include event naming conventions, super-property registrations, enhanced event properties for auth, chat, settings, sync, and extension features, as well as recommended dashboard KPIs and monitoring plans to track acquisition, onboarding, chat health, personalization, and extension engagement. This documentation will guide engineering alignment and implementation to enable actionable analytics through PostHog. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
…trumentation - Reduced and optimized core event sets to focus on high-leverage insights, minimizing low-signal noise. - Standardized event naming conventions to lowercase snake_case with scope.action structure. - Updated super-properties registration for better session identification including guest handling. - Enhanced event properties for chat creation, streaming, attachments, and OpenRouter API key lifecycle. - Added new telemetry events for sync connection state, fallback storage usage, and chat rate limiting. - Revised dashboard and funnel definitions to reflect streamlined event model and metrics. - Improved implementation notes for consistency and expanded coverage across web, server, and extension components. This update enables effective monitoring of acquisition, chat usage, reliability, and onboarding activation without excess event volume. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
- Add base super properties and sanitize event properties for client and server PostHog usage - Track fallback storage usage in server router for chats and messages - Add detailed client event tracking in web chat handler and components - Introduce PosthogBootstrap component for consistent client identification and property registration - Enhance event tracking with user workspace IDs, auth states, theme info, and OpenRouter API key status - Track marketing CTA clicks and landing visits with detailed metadata - Improve rate limit and error event captures for chat and OpenRouter APIs - Add client property registration on model selection and OpenRouter key management - Add sync connection state events to PostHog This update improves analytics granularity and consistency across client and server, enabling better user behavior insights and troubleshooting. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR implements comprehensive telemetry infrastructure using PostHog analytics and enhances security hardening across the OpenChat application. The changes introduce systematic event tracking throughout the user journey - from landing page visits and dashboard entries to chat creation and file attachments. PostHog integration includes client-side user identification, server-side event capture with environment-aware metadata, and property sanitization to prevent undefined values from corrupting analytics data.
The security improvements focus on two critical areas: (1) Enhanced user ID sanitization in dashboard bootstrap scripts using Unicode escape sequences to prevent XSS attacks when injecting user data into inline JavaScript, and (2) Upgraded OpenRouter API key encryption from simple SHA-256 hashing to PBKDF2 with 100,000 iterations, including versioned encryption with legacy fallback support for backward compatibility.
The telemetry system captures key product metrics including authentication states, chat operations, model selections, API key usage patterns, rate limiting events, and system fallback scenarios when PostgreSQL is unavailable. All changes maintain existing functionality while adding comprehensive observability and stronger security protections.
Important Files Changed
Changed Files
| Filename | Score | Overview | 
|---|---|---|
| apps/web/src/app/dashboard/layout.tsx | 5/5 | Critical security hardening for user ID sanitization using Unicode escapes to prevent XSS in inline scripts | 
| apps/server/src/lib/openrouter.ts | 4/5 | Major security upgrade implementing PBKDF2 key derivation with legacy fallback for API key encryption | 
| apps/web/src/components/posthog-bootstrap.tsx | 4/5 | New client-side PostHog initialization component with comprehensive user property tracking | 
| apps/web/src/lib/posthog.ts | 4/5 | Core PostHog client implementation with data sanitization and environment-aware configuration | 
| apps/server/src/lib/posthog.ts | 4/5 | Server-side PostHog integration with property sanitization and deployment metadata | 
| apps/web/src/lib/posthog-server.ts | 4/5 | Server-side PostHog infrastructure with multi-platform deployment support | 
| apps/web/src/app/api/chat/chat-handler.ts | 4/5 | Enhanced chat API telemetry with IP hashing for privacy and comprehensive error tracking | 
| apps/server/src/routers/index.ts | 4/5 | Chat router telemetry improvements with database fallback monitoring | 
| apps/web/src/components/app-sidebar.tsx | 4/5 | Comprehensive sidebar telemetry including dashboard entry and chat creation tracking | 
| apps/web/src/components/chat-room.tsx | 4/5 | Chat room telemetry with error tracking and user interaction analytics | 
| apps/web/src/components/providers.tsx | 3/5 | PostHog provider setup with enhanced pageview tracking and referrer analysis | 
| apps/web/src/components/hero-section.tsx | 4/5 | Landing page telemetry conversion from server to client component for marketing analytics | 
| apps/web/src/components/chat-composer.tsx | 4/5 | File attachment telemetry tracking with privacy-conscious implementation | 
| apps/web/src/components/model-selector.tsx | 4/5 | Model selection telemetry for understanding user preferences | 
| apps/web/src/components/account-settings-modal.tsx | 4/5 | OpenRouter API key management telemetry with masked data logging | 
| apps/web/src/components/openrouter-link-modal.tsx | 4/5 | API key prompt tracking with deduplication logic | 
| apps/web/src/lib/sync.ts | 4/5 | WebSocket sync telemetry for connection state monitoring | 
Confidence score: 4/5
- This PR implements comprehensive telemetry and security hardening with minimal risk to existing functionality
- Score reflects strong security improvements and well-structured telemetry implementation, but lowered due to complex client-server coordination and potential SSR hydration issues in providers.tsx
- Pay special attention to apps/web/src/components/providers.tsx for potential hydration mismatches from synchronous DOM access
Sequence Diagram
sequenceDiagram
    participant User
    participant WebApp
    participant Server
    participant PostHog
    participant OpenRouter
    
    User->>WebApp: "Visit dashboard"
    WebApp->>WebApp: "Initialize PostHog client"
    WebApp->>PostHog: "Register client properties (theme, auth_state)"
    WebApp->>Server: "Load chat list"
    Server->>PostHog: "capturePosthogEvent('workspace.fallback_storage_used')"
    
    User->>WebApp: "Add OpenRouter API key"
    WebApp->>WebApp: "Encrypt key with AES-256-GCM + PBKDF2"
    WebApp->>WebApp: "Store encrypted key locally"
    WebApp->>PostHog: "captureClientEvent('openrouter.key_saved')"
    
    User->>WebApp: "Send chat message"
    WebApp->>Server: "POST /api/chat with encrypted payload"
    Server->>Server: "Validate request origin & rate limits"
    Server->>Server: "Decrypt API key with legacy fallback"
    Server->>OpenRouter: "Stream chat completion"
    Server->>PostHog: "capturePosthogEvent('chat_message_stream')"
    Server->>WebApp: "Stream response chunks"
    Server->>Server: "Persist messages via streamUpsert"
    Server->>PostHog: "capturePosthogEvent('workspace.fallback_storage_used')"
    Context used:
- Context from dashboard- AGENTS.md (source)
17 files reviewed, 8 comments
| setInternalValue(currentValue) | ||
| } | ||
| onChange?.(currentValue) | ||
| registerClientProperties({ model_id: currentValue }) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: This creates duplicate tracking events - the useEffect on line 96 will also fire when currentValue becomes the new selectedValue
| process.env.POSTHOG_DEPLOYMENT_REGION ?? process.env.VERCEL_REGION ?? "local"; | ||
|  | ||
| const BASE_SUPER_PROPERTIES = Object.freeze({ | ||
| app: "openchat-server", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: App name 'openchat-server' should match the web app name for consistency
|  | ||
| useEffect(() => { | ||
| if (!resolvedWorkspaceId) return; | ||
| if (identifyRef.current === resolvedWorkspaceId && session?.user) return; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: potential race condition - if session?.user changes but resolvedWorkspaceId stays the same, the identify call might be skipped when it should run
| void (async () => { | ||
| let hasKey = false; | ||
| try { | ||
| const key = await loadOpenRouterKey(); | ||
| hasKey = Boolean(key); | ||
| registerClientProperties({ has_openrouter_key: hasKey }); | ||
| } catch { | ||
| hasKey = false; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: The async OpenRouter key loading could fail and set hasKey to false in catch block - this might mask legitimate errors vs missing keys
| const referrerUrl = document.referrer && document.referrer.length > 0 ? document.referrer : "direct"; | ||
| let referrerDomain = "direct"; | ||
| if (referrerUrl !== "direct") { | ||
| try { | ||
| referrerDomain = new URL(referrerUrl).hostname; | ||
| } catch { | ||
| referrerDomain = "direct"; | ||
| } | ||
| } | ||
| const entryPath = window.location.pathname || "/"; | ||
| const entryQuery = window.location.search || ""; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Direct DOM access in useEffect could cause hydration mismatches in Next.js. Consider using a client-only wrapper or checking typeof window !== 'undefined' before accessing window properties.
| if (!(error as any)?.__posthogTracked) { | ||
| const status = typeof (error as any)?.status === "number" ? (error as any).status : 0; | ||
| let providerHost = "openrouter.ai"; | ||
| const providerUrl = (error as any)?.providerUrl; | ||
| if (typeof providerUrl === "string" && providerUrl.length > 0) { | ||
| try { | ||
| providerHost = new URL(providerUrl).host; | ||
| } catch { | ||
| providerHost = "openrouter.ai"; | ||
| } | ||
| } | ||
| captureClientEvent("openrouter.models_fetch_failed", { | ||
| status, | ||
| error_message: | ||
| error instanceof Error && error.message ? error.message : "Failed to load OpenRouter models.", | ||
| provider_host: providerHost, | ||
| has_api_key: Boolean(key), | ||
| }); | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Duplicate error tracking logic exists in both the main error path (lines 105-115) and catch block. Consider extracting this to a helper function to reduce code duplication and ensure consistency.
| registerClientProperties({ has_openrouter_key: true }); | ||
| captureClientEvent("openrouter.key_saved", { | ||
| source: "modal", | ||
| masked_tail: key.slice(-4), | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Logging the masked tail of API keys (even just 4 characters) could be a security risk. Consider if this telemetry data is necessary or use a hash instead.
| throw Object.assign(new Error(errorMessage), { | ||
| __posthogTracked: true, | ||
| status: response.status, | ||
| providerUrl: response.url, | ||
| }); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Using Object.assign to add properties to Error instances is unconventional. Consider creating a custom error class or using a more standard approach for error metadata.
Summary
Testing