Skip to content

Conversation

will-holley
Copy link
Contributor

@will-holley will-holley commented Oct 8, 2025

Add estimated probes display below "Estimated Duration" in red team setup review page.

image

Create shared EstimatedProbesDisplay component showing attack count below
Estimated Duration in review page. Refactor both Review and Strategies views
to use consistent "Estimated Probes" label instead of duplicate code.

- Create EstimatedProbesDisplay component with memoization and theming
- Update Review page to show estimated probes below duration
- Refactor Strategies view to use shared component
- Add comprehensive test coverage (30 new tests)
- Remove 48 lines of duplicate code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@will-holley will-holley self-assigned this Oct 8, 2025
- Add prompts, applicationDefinition, and entities to mockConfig in test
- Fix TypeScript build error
- Apply code formatting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@will-holley will-holley marked this pull request as ready for review October 8, 2025 20:41
@will-holley will-holley requested a review from a team October 8, 2025 20:41
@will-holley will-holley changed the title feat: add estimated probes display to red team review page feat(webui): add estimated probes display to red team review page Oct 8, 2025
Copy link
Contributor

coderabbitai bot commented Oct 8, 2025

📝 Walkthrough

Walkthrough

Adds a new EstimatedProbesDisplay React component that computes and displays an estimated probe count with memoization, theming, optional tooltip content, and sx styling. Integrates this component into Review and Strategies to replace inline probe UI. Introduces a comprehensive test suite for EstimatedProbesDisplay covering rendering, formatting, memoization, tooltip behavior, theming, edge cases, and structure. Extends Review tests to mock strategy utilities and validate probe display, formatting, recalculation on config changes, and zero handling. Removes direct getEstimatedProbes usage and related Tooltip/Icon in Strategies in favor of the new component. No existing public exports are modified; one new export is added.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Scope: 4 files; one new component, integration in two components, extensive tests in two files.
  • Complexity: Runtime logic is straightforward; tests are dense with varied scenarios and mocks.
  • Heterogeneity: Mix of UI abstraction, memoization use, test utilities, and theme/tooltip checks.
  • Repetition: Some repeated formatting and tooltip assertions lower per-test complexity.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title clearly and concisely describes the primary change of adding an estimated probes display to the red team review page, matching the PR objectives without extraneous details.
Description Check ✅ Passed The pull request description directly explains the addition of the estimated probes display below the “Estimated Duration” and includes a relevant screenshot, making it clearly on-topic and informative for reviewers.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch will-holley/redteam-attack-count

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (6)
src/app/src/pages/redteam/setup/components/Review.test.tsx (3)

50-53: Hoist strategies/utils mock for clarity and ESM safety

Although Vitest hoists vi.mock, placing this mock with the other top-level mocks before importing modules reduces confusion and avoids edge cases with ESM import timing.


289-315: Test does not verify relative order as claimed

The test description says “renders estimated probes below estimated duration” but only checks existence. Assert DOM order explicitly.

Apply this addition to assert order:

       expect(estimatedDurationBox).toBeInTheDocument();
       expect(estimatedProbesBox).toBeInTheDocument();
+      // Assert that probes box follows duration box in the DOM
+      const order = estimatedDurationBox!.compareDocumentPosition(estimatedProbesBox!);
+      expect(order & Node.DOCUMENT_POSITION_FOLLOWING).toBeTruthy();

89-104: Wrap Review renders in ThemeProvider

Review uses MUI theme. Wrap renders in a ThemeProvider to prevent future theme-dependent regressions. As per coding guidelines.

Also applies to: 203-221

src/app/src/pages/redteam/setup/components/Review.tsx (1)

1051-1053: Make duration tooltip icon keyboard-accessible

Wrap the info icon in an IconButton with an aria-label to enable focus and keyboard-triggered tooltip.

Apply this diff:

-          <Tooltip title="Estimated time includes test generation and probe execution. Actual time may vary based on target response times and network conditions.">
-            <InfoOutlinedIcon sx={{ fontSize: 16, color: 'text.secondary', cursor: 'help' }} />
-          </Tooltip>
+          <Tooltip title="Estimated time includes test generation and probe execution. Actual time may vary based on target response times and network conditions.">
+            <IconButton
+              size="small"
+              aria-label="Estimated duration help"
+              sx={{ p: 0.25, color: 'text.secondary' }}
+            >
+              <InfoOutlinedIcon sx={{ fontSize: 16 }} />
+            </IconButton>
+          </Tooltip>
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx (1)

66-72: Prefer user-centric queries for the info icon

Avoid document.querySelector; use Testing Library queries. Expose the icon via an accessible control (aria-label) and query with getByLabelText. As per coding guidelines.

After updating the component (wrap icon in IconButton with aria-label), update tests like:

-const infoIcon = document.querySelector('[data-testid="InfoOutlinedIcon"]');
-expect(infoIcon).toBeInTheDocument();
+const infoButton = screen.getByLabelText(/estimated probes help/i);
+expect(infoButton).toBeInTheDocument();

- await user.hover(infoIcon);
+ await user.hover(infoButton);

Also applies to: 182-201, 211-221, 223-237

src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (1)

29-29: Optional: refine memo dependencies

If config identity changes frequently, consider depending on only the fields that affect estimation (numTests, plugins, strategies) to reduce unnecessary recalculation.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5aea733 and 3c80d81.

📒 Files selected for processing (5)
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx (1 hunks)
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (1 hunks)
  • src/app/src/pages/redteam/setup/components/Review.test.tsx (3 hunks)
  • src/app/src/pages/redteam/setup/components/Review.tsx (2 hunks)
  • src/app/src/pages/redteam/setup/components/Strategies.tsx (2 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Prefer not to introduce new TypeScript types; use existing interfaces whenever possible

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
src/app/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (src/app/CLAUDE.md)

src/app/src/**/*.{ts,tsx}: Never use fetch() directly; always use callApi() from @app/utils/api for all HTTP requests
Access Zustand state outside React components via store.getState(); do not call hooks outside components
Use the @app/* path alias for internal imports as configured in Vite

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
src/app/src/{components,pages}/**/*.tsx

📄 CodeRabbit inference engine (src/app/CLAUDE.md)

src/app/src/{components,pages}/**/*.tsx: Use the class-based ErrorBoundary component (@app/components/ErrorBoundary) to wrap error-prone UI
Access theme via useTheme() from @mui/material/styles instead of hardcoding theme values
Use useMemo/useCallback only when profiling indicates benefit; avoid unnecessary memoization
Implement explicit loading and error states for components performing async operations
Prefer MUI composition and the sx prop for styling over ad-hoc inline styles

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
**/*.{tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/react-components.mdc)

**/*.{tsx,jsx}: Use icons from @mui/icons-material
Prefer commonly used icons from @mui/icons-material for intuitive experience

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
src/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

In the React app (src/app), use callApi from @app/utils/api for all API calls instead of fetch()

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Keep a consistent import order (Biome handles sorting)
Use consistent curly braces for all control statements
Prefer const over let and avoid var
Use object shorthand syntax when possible
Use async/await for asynchronous code
Always sanitize sensitive data before logging; pass potentially sensitive context as the logger’s second argument for automatic sanitization
Do not log raw secrets; avoid interpolating headers/bodies directly into log strings (use structured logger context instead)
Manually sanitize objects (e.g., with sanitizeObject) before using them in non-logging contexts or persisting metadata

Files:

  • src/app/src/pages/redteam/setup/components/Review.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx
**/*.{test,spec}.{js,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Avoid disabling or skipping tests unless absolutely necessary and documented

Files:

  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
src/app/src/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (src/app/CLAUDE.md)

src/app/src/**/*.test.{ts,tsx}: In tests, import testing APIs from vitest (e.g., describe, it, expect, vi) and not from @jest/globals
Mock API calls in tests; do not make real network requests (e.g., vi.mock('@app/utils/api'))
Wrap MUI components in a ThemeProvider when testing components that depend on theme
Prefer Testing Library queries that reflect user behavior (getByRole, getByLabelText, etc.)
Clean up after tests as needed (e.g., clear/reset mocks) despite Vitest auto-cleanup
When testing React Router components, render under a MemoryRouter
Use Vitest as the test runner for the frontend app; do not use Jest APIs

Files:

  • src/app/src/pages/redteam/setup/components/Review.test.tsx
  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
🧠 Learnings (2)
📚 Learning: 2025-10-05T16:56:39.114Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/app/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:56:39.114Z
Learning: Applies to src/app/src/**/*.test.{ts,tsx} : In tests, import testing APIs from vitest (e.g., describe, it, expect, vi) and not from jest/globals

Applied to files:

  • src/app/src/pages/redteam/setup/components/Review.test.tsx
📚 Learning: 2025-10-05T16:58:47.598Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/providers/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:58:47.598Z
Learning: Applies to src/providers/test/providers/**/*.test.{ts,tsx,js,jsx} : Provider tests must cover rate limits, timeouts, and invalid configuration scenarios

Applied to files:

  • src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx
🧬 Code graph analysis (5)
src/app/src/pages/redteam/setup/components/Review.tsx (1)
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (1)
  • EstimatedProbesDisplay (22-58)
src/app/src/pages/redteam/setup/components/Strategies.tsx (1)
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (1)
  • EstimatedProbesDisplay (22-58)
src/app/src/pages/redteam/setup/components/Review.test.tsx (1)
src/app/src/pages/redteam/setup/components/Review.tsx (1)
  • Review (75-1193)
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.test.tsx (3)
src/app/src/pages/redteam/setup/components/strategies/utils.ts (1)
  • getEstimatedProbes (45-70)
src/app/src/pages/redteam/setup/types.ts (1)
  • Config (27-41)
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (1)
  • EstimatedProbesDisplay (22-58)
src/app/src/pages/redteam/setup/components/EstimatedProbesDisplay.tsx (2)
src/app/src/pages/redteam/setup/types.ts (1)
  • Config (27-41)
src/app/src/pages/redteam/setup/components/strategies/utils.ts (1)
  • getEstimatedProbes (45-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Tusk Test Runner (3)
  • GitHub Check: Tusk Test Runner (2)
  • GitHub Check: Tusk Test Runner (4)
  • GitHub Check: Tusk Test Runner (1)
  • GitHub Check: Tusk Tester
🔇 Additional comments (3)
src/app/src/pages/redteam/setup/components/Review.tsx (1)

1056-1057: EstimatedProbesDisplay integration looks good

Placement, props, and styling parity with duration box are consistent.

src/app/src/pages/redteam/setup/components/Strategies.tsx (2)

439-442: EstimatedProbesDisplay placement is appropriate

Good abstraction and consistent UX with Review page.


28-29: getStrategyId import is valid
getStrategyId is exported in strategies/utils.ts, so the import resolves correctly.

Copy link
Contributor

use-tusk bot commented Oct 8, 2025

❌ Generated 8 tests - 7 passed, 1 failed (0dfdff3) View tests ↗

Test Summary

  • Review - 4 ✅
  • Strategies - 3 ✅, 1 ❌

Results

Tusk's tests for the estimated probes display feature are mostly solid, with 7 passing tests covering edge cases like large numbers, narrow viewports, and preset selection. There's one failing test in the Strategies component that reveals a critical issue: selecting a multi-turn preset shouldn't automatically apply stateful configuration, but it appears to be doing so. This could lead to unexpected behavior and resource usage for users configuring red team evaluations.

Key Points

  • ❌ The Strategies component incorrectly applies stateful configuration when multi-turn preset is selected, which could cause unintended test coverage and increased costs
  • The EstimatedProbesDisplay correctly handles edge cases like extremely large probe counts and zero probes when no strategies are selected
  • Tests verify that the UI correctly identifies and displays preset names (e.g., 'Quick') instead of 'Custom' when strategies match a predefined preset
  • Performance warning banner correctly appears at exactly 80% strategy selection threshold, which is important for warning users about evaluation time and cost
  • The component properly renders in narrow viewports, ensuring mobile/responsive compatibility

View check history

Commit Status Output Created (UTC)
6c3396e ⏩ Skipped due to new commit on branch Output Oct 8, 2025 8:27PM
3c80d81 ❌ Generated 8 tests - 7 passed, 1 failed Tests Oct 8, 2025 8:33PM
0dfdff3 ❌ Generated 8 tests - 7 passed, 1 failed Tests Oct 9, 2025 6:03PM

View output in GitHub ↗

Was Tusk helpful? Give feedback by reacting with 👍 or 👎

</Tooltip>
</Box>

<EstimatedProbesDisplay config={config} />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will buy you lunch if you combine this with the estimated time banner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants