-
Notifications
You must be signed in to change notification settings - Fork 1
feat: telemetry and security hardening #52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…on guide This commit introduces a detailed PostHog instrumentation blueprint for the OpenChat monorepo, covering web, server, and extension components. It outlines baseline setup, event taxonomy with triggers and properties, common properties, dashboards and insights strategies, implementation notes by file, and next steps for full analytics coverage. Key additions include event naming conventions, super-property registrations, enhanced event properties for auth, chat, settings, sync, and extension features, as well as recommended dashboard KPIs and monitoring plans to track acquisition, onboarding, chat health, personalization, and extension engagement. This documentation will guide engineering alignment and implementation to enable actionable analytics through PostHog. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
…trumentation - Reduced and optimized core event sets to focus on high-leverage insights, minimizing low-signal noise. - Standardized event naming conventions to lowercase snake_case with scope.action structure. - Updated super-properties registration for better session identification including guest handling. - Enhanced event properties for chat creation, streaming, attachments, and OpenRouter API key lifecycle. - Added new telemetry events for sync connection state, fallback storage usage, and chat rate limiting. - Revised dashboard and funnel definitions to reflect streamlined event model and metrics. - Improved implementation notes for consistency and expanded coverage across web, server, and extension components. This update enables effective monitoring of acquisition, chat usage, reliability, and onboarding activation without excess event volume. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
- Add base super properties and sanitize event properties for client and server PostHog usage - Track fallback storage usage in server router for chats and messages - Add detailed client event tracking in web chat handler and components - Introduce PosthogBootstrap component for consistent client identification and property registration - Enhance event tracking with user workspace IDs, auth states, theme info, and OpenRouter API key status - Track marketing CTA clicks and landing visits with detailed metadata - Improve rate limit and error event captures for chat and OpenRouter APIs - Add client property registration on model selection and OpenRouter key management - Add sync connection state events to PostHog This update improves analytics granularity and consistency across client and server, enabling better user behavior insights and troubleshooting. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR implements comprehensive PostHog telemetry instrumentation across the OpenChat application and introduces important security hardening measures. The changes add detailed analytics tracking to both web and server components, capturing user interactions, system performance metrics, and error patterns. Key telemetry additions include chat creation/messaging events, OpenRouter API key management tracking, WebSocket connection monitoring, and marketing analytics for the landing page.
The security enhancements focus on two critical areas: (1) sanitizing user IDs before injecting them into inline JavaScript to prevent XSS vulnerabilities, and (2) upgrading OpenRouter API key encryption from simple SHA-256 hashing to PBKDF2 with 100,000 iterations for stronger protection against dictionary and rainbow table attacks. A new PosthogBootstrap component handles client-side analytics initialization with proper user identification for both authenticated and guest users.
The telemetry infrastructure follows a consistent event naming pattern (scope.action) and includes comprehensive property sanitization to ensure data quality. All PostHog integrations maintain user privacy through techniques like IP address hashing and masking sensitive data while providing rich analytics context.
Important Files Changed
Changed Files
| Filename | Score | Overview |
|---|---|---|
| apps/server/src/lib/openrouter.ts | 4/5 | Enhanced OpenRouter API key encryption from SHA-256 to PBKDF2 with 100k iterations for stronger security |
| apps/web/src/app/dashboard/layout.tsx | 3/5 | Added custom string sanitization for user IDs in inline JavaScript to prevent XSS vulnerabilities |
| apps/web/src/components/posthog-bootstrap.tsx | 4/5 | New component that initializes PostHog client-side tracking with proper user identification and state management |
| apps/web/src/lib/posthog.ts | 4/5 | Enhanced client-side PostHog integration with property sanitization and workspace grouping functionality |
| apps/server/src/lib/posthog.ts | 5/5 | Comprehensive server-side PostHog configuration with environment handling and graceful shutdown |
| apps/web/src/lib/posthog-server.ts | 4/5 | Server-side PostHog telemetry with environment-based configuration and property sanitization |
| apps/server/src/routers/index.ts | 4/5 | Added telemetry tracking for chat operations and fallback storage usage with consistent event patterns |
| apps/web/src/app/api/chat/chat-handler.ts | 4/5 | Comprehensive chat API telemetry with IP hashing for privacy and enhanced rate limiting headers |
| apps/web/src/components/hero-section.tsx | 4/5 | Converted to client component with marketing analytics for CTA clicks and landing page visits |
| apps/web/src/components/app-sidebar.tsx | 4/5 | Added telemetry for dashboard entry and chat creation events with detailed user state tracking |
| apps/web/src/components/chat-room.tsx | 4/5 | Comprehensive analytics for chat interactions, API key events, and error tracking with proper deduplication |
| apps/web/src/components/chat-composer.tsx | 4/5 | Added attachment event tracking with privacy protection via data-ph-no-capture attribute |
| apps/web/src/components/providers.tsx | 4/5 | Enhanced PostHog initialization with referrer tracking and PosthogBootstrap component integration |
| apps/web/src/lib/sync.ts | 5/5 | Added WebSocket connection state telemetry with retry count and tab ID tracking |
| apps/web/src/components/model-selector.tsx | 5/5 | Simple PostHog integration for model selection event tracking without breaking changes |
| apps/web/src/components/account-settings-modal.tsx | 5/5 | Privacy-conscious telemetry for OpenRouter API key management with masked sensitive data |
| apps/web/src/components/openrouter-link-modal.tsx | 4/5 | Added telemetry for API key prompt modal with ref-based duplicate event prevention |
| posthog.md | 5/5 | Comprehensive telemetry documentation defining 14 core events and implementation guidelines |
Confidence score: 3/5
- This PR requires careful review due to custom security implementations and extensive telemetry changes that could impact system performance
- Score lowered due to custom XSS sanitization implementation in layout.tsx where well-tested libraries would be safer, potential performance impact from extensive telemetry instrumentation, and the complexity of changes across multiple critical components
- Pay close attention to apps/web/src/app/dashboard/layout.tsx for the custom sanitization logic and apps/server/src/lib/openrouter.ts for the PBKDF2 implementation
Sequence Diagram
sequenceDiagram
participant User
participant Web as "Web App"
participant PostHog as "PostHog Client"
participant Server as "Server API"
participant OpenRouter as "OpenRouter API"
participant DB as "Database"
User->>Web: "Visit landing page"
Web->>PostHog: "Capture marketing.visit_landing event"
User->>Web: "Click 'Try OpenChat' CTA"
Web->>PostHog: "Capture marketing.cta_clicked event"
User->>Web: "Navigate to dashboard"
Web->>Web: "Bootstrap PostHog with sanitized user ID"
Web->>PostHog: "Identify client with workspace context"
Web->>PostHog: "Capture dashboard.entered event"
User->>Web: "Click 'New Chat'"
Web->>Server: "POST /chats/create"
Server->>DB: "Insert new chat record"
alt Database available
DB-->>Server: "Success"
Server->>PostHog: "Capture chat.created event (postgres backend)"
else Database unavailable
Server->>Server: "Use memory fallback"
Server->>PostHog: "Capture workspace.fallback_storage_used event"
Server->>PostHog: "Capture chat.created event (memory backend)"
end
Server-->>Web: "Return chat ID and storage backend"
Web->>PostHog: "Capture chat.created event (client-side)"
User->>Web: "Enter OpenRouter API key"
Web->>Web: "Encrypt API key with PBKDF2-derived AES key"
Web->>Server: "Store encrypted API key"
Server->>Server: "Derive encryption key using PBKDF2"
Server->>Server: "Encrypt API key with AES-256-GCM"
Server->>DB: "Store encrypted key in accounts table"
Web->>PostHog: "Capture openrouter.key_saved event"
Web->>PostHog: "Register has_openrouter_key: true property"
User->>Web: "Send chat message"
Web->>PostHog: "Capture chat_message_submitted event"
Web->>Server: "POST /api/chat with message and model"
Server->>Server: "Validate request origin and rate limits"
alt Rate limited
Server->>PostHog: "Capture chat.rate_limited event"
Server-->>Web: "HTTP 429 Too Many Requests"
else Within limits
Server->>Server: "Decrypt OpenRouter API key"
Server->>Server: "Create streaming response handler"
Server->>OpenRouter: "Stream chat completion request"
OpenRouter-->>Server: "Stream response chunks"
Server->>Server: "Persist message chunks via streamUpsert"
Server->>DB: "Upsert message content incrementally"
Server->>PostHog: "Capture chat_message_stream event with metrics"
Server-->>Web: "Stream response to client"
end
User->>Web: "Attach file to message"
Web->>Web: "Validate file size against 5MB limit"
alt File too large
Web->>PostHog: "Capture chat.attachment_event (rejected)"
else File accepted
Web->>PostHog: "Capture chat.attachment_event (accepted)"
end
Context used:
- Context from
dashboard- AGENTS.md (source)
18 files reviewed, 8 comments
| throw new Error("Missing OPENROUTER_API_KEY_SECRET env for encrypting OpenRouter API keys"); | ||
| } | ||
| return createHash("sha256").update(secret).digest(); | ||
| const salt = "openrouter:key-derivation-salt"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Using a fixed salt reduces security compared to random salts. Consider generating unique salts per user or per environment.
| identifyClient(currentUserId, { | ||
| workspaceId: currentUserId, | ||
| properties: { auth_state: session?.user ? "member" : "guest" }, | ||
| }); | ||
| registerClientProperties({ | ||
| auth_state: session?.user ? "member" : "guest", | ||
| workspace_id: currentUserId, | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider moving the telemetry logic into a custom hook like useUserTelemetry() to keep the component focused and make the tracking reusable
|
|
||
| useEffect(() => { | ||
| if (!resolvedWorkspaceId) return; | ||
| if (identifyRef.current === resolvedWorkspaceId && session?.user) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: The condition && session?.user may cause re-identification when session transitions from null to defined user, even with same workspace ID. Consider checking session state in dependency array.
| const charMap: Record<string, string> = { | ||
| "<": "\\u003C", | ||
| ">": "\\u003E", | ||
| "/": "\\u002F", | ||
| "\\": "\\\\", | ||
| "\u0008": "\\b", | ||
| "\u000c": "\\f", | ||
| "\u000a": "\\n", | ||
| "\u000d": "\\r", | ||
| "\u0009": "\\t", | ||
| "\u0000": "\\0", | ||
| "\u2028": "\\u2028", | ||
| "\u2029": "\\u2029", | ||
| }; | ||
|
|
||
| function escapeRegexChar(char: string): string { | ||
| const code = char.charCodeAt(0); | ||
| if (char === "\\" || char === "]" || char === "^" || char === "-") { | ||
| return `\\${char}`; | ||
| } | ||
| if (char === "/") return "\\/"; | ||
| if (code < 0x20 || char === "\u2028" || char === "\u2029") { | ||
| return `\\u${code.toString(16).padStart(4, "0")}`; | ||
| } | ||
| return char; | ||
| } | ||
|
|
||
| const UNSAFE_PATTERN = new RegExp(`[${Object.keys(charMap).map(escapeRegexChar).join("")}]`, "g"); | ||
|
|
||
| function escapeUnsafeChars(str: string): string { | ||
| return str.replace(UNSAFE_PATTERN, (char) => charMap[char] ?? char); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Custom string escaping implementation adds complexity. Consider using serialize-javascript library or Next.js built-in utilities for safer, well-tested JSON serialization for scripts.
| if (char === "\\" || char === "]" || char === "^" || char === "-") { | ||
| return `\\${char}`; | ||
| } | ||
| if (char === "/") return "\\/"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Forward slash escaping in regex may be unnecessary since it's not a special regex character in character classes.
| useEffect(() => { | ||
| if (visitTrackedRef.current) return | ||
| if (typeof session === 'undefined') return | ||
| visitTrackedRef.current = true | ||
| const referrerUrl = document.referrer && document.referrer.length > 0 ? document.referrer : 'direct' | ||
| let referrerDomain = 'direct' | ||
| if (referrerUrl !== 'direct') { | ||
| try { | ||
| referrerDomain = new URL(referrerUrl).hostname | ||
| } catch { | ||
| referrerDomain = 'direct' | ||
| } | ||
| } | ||
| let utmSource: string | null = null | ||
| try { | ||
| const params = new URLSearchParams(window.location.search) | ||
| const source = params.get('utm_source') | ||
| if (source && source.length > 0) { | ||
| utmSource = source | ||
| } | ||
| } catch { | ||
| utmSource = null | ||
| } | ||
| const entryPath = window.location.pathname || '/' | ||
| captureClientEvent('marketing.visit_landing', { | ||
| referrer_url: referrerUrl, | ||
| referrer_domain: referrerDomain, | ||
| utm_source: utmSource ?? undefined, | ||
| entry_path: entryPath, | ||
| session_is_guest: !session?.user, | ||
| }) | ||
| }, [session]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: The useEffect runs whenever session changes, but visitTrackedRef only prevents re-execution after the first successful run. If session transitions from undefined to defined multiple times, this could potentially fire multiple events despite the ref guard.
| function screenWidthBucket(width: number) { | ||
| if (width < 640) return 'xs' | ||
| if (width < 768) return 'sm' | ||
| if (width < 1024) return 'md' | ||
| if (width < 1280) return 'lg' | ||
| return 'xl' | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider handling edge cases for very large screens (>1280px) and very small screens (<320px) with explicit buckets rather than defaulting to 'xl'.
|
|
||
| function hashClientIp(ip: string): string { | ||
| try { | ||
| return createHash("sha256").update(ip).digest("hex").slice(0, 16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider using a longer truncation (e.g., 32 chars) for better collision resistance while maintaining privacy
Summary
Testing