devin-ai-integration[bot]
Contributor

@devin-ai-integration devin-ai-integration bot commented Sep 23, 2025

Fix metadata field types in OpenAPI schemas (ENG-3620)

Summary

Updates the OpenAPI schema definitions in both v1 and v2 API documentation to accurately reflect the actual types of metadata fields returned by the API. The scope was expanded based on PR feedback to include ALL HTML-extracted metadata fields, not just the originally planned keywords and ogLocaleAlternate.

Key Changes:

  • HTML-extracted fields now use a oneOf construct to support both string and string[] types:
    • title, description, language, keywords, and the generic <any other metadata> pattern
  • ogLocaleAlternate remains explicitly typed as string[] (array only)
  • Non-HTML fields unchanged: url, sourceURL, statusCode, error, cacheState, cachedAt, numPages
  • Applied consistently across all metadata schema locations in both OpenAPI files (8+ locations per file)
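The diff itself is not reproduced in this conversation, so the following is only a sketch of what the two field shapes described above might look like, written as Python dictionaries that mirror the JSON. The constant names and the `matches` helper are hypothetical and cover only the `type`, `items`, and `oneOf` keywords; this is not a full JSON Schema validator.

```python
import json

# Assumed shape for HTML-extracted fields (title, description, keywords, ...).
# The real schema fragment in the PR may differ in detail.
STRING_OR_STRING_ARRAY = {
    "oneOf": [
        {"type": "string"},
        {"type": "array", "items": {"type": "string"}},
    ]
}

# Assumed shape for ogLocaleAlternate, which the PR keeps array-only.
STRING_ARRAY_ONLY = {"type": "array", "items": {"type": "string"}}


def matches(schema, value):
    """Check a value against the tiny subset of keywords used above."""
    if "oneOf" in schema:
        # oneOf requires exactly one branch to match.
        return sum(matches(branch, value) for branch in schema["oneOf"]) == 1
    if schema.get("type") == "string":
        return isinstance(value, str)
    if schema.get("type") == "array":
        return isinstance(value, list) and all(
            matches(schema.get("items", {}), item) for item in value
        )
    return True  # No constraint that this sketch understands.


if __name__ == "__main__":
    # Print the fragment as it would appear in a JSON OpenAPI file.
    print(json.dumps(STRING_OR_STRING_ARRAY, indent=2))
```

Because a string and an array can never both match, the two `oneOf` branches are mutually exclusive, which is what `oneOf` requires.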

This aligns the documentation with actual SDK implementations:

  • Python SDK: Optional[Union[str, List[str]]]
  • JavaScript SDK: string | string[]
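For consumers, a union type like this usually means normalizing before use. A minimal sketch, assuming the Python SDK annotation quoted above; the `MetadataValue` alias and the `as_list` helper are hypothetical names and are not part of either SDK:

```python
from typing import List, Optional, Union

# Mirrors the Python SDK annotation quoted in the PR description.
MetadataValue = Optional[Union[str, List[str]]]


def as_list(value: MetadataValue) -> List[str]:
    """Normalize a metadata value to a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)  # Copy so callers cannot mutate the original.
```

With this, `as_list("news")` and `as_list(["news"])` produce the same list, so downstream code does not need to branch on the field's runtime type.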

Review & Testing Checklist for Human

🔴 High Risk - 4 items to verify:

  • Schema validation with real API responses - Test that actual API calls return responses that validate against the updated schemas (the JSON syntax check passes, but compatibility with real responses is untested)
  • Completeness of field classification - Verify the distinction between HTML-extracted vs system fields is correct (I interpreted this from the PR comment but may have misclassified some fields)
  • Consistency across all schema locations - Check that all 8+ metadata schema definitions in each file were updated consistently (manual editing across many locations has high error risk)
  • ogLocaleAlternate typing - Confirm this field should remain string[] only and not oneOf like other HTML-extracted fields

Test Plan Recommendation

  1. Make API calls that return metadata with both string and array values for HTML-extracted fields
  2. Validate the responses against the updated OpenAPI schemas using a schema validator
  3. Test edge cases with empty arrays, null values, and mixed field types in the same response
  4. Verify the documentation renders correctly in API docs (Mintlify preview available)
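Steps 1 to 3 could be approximated with a small check run over captured response metadata. The sketch below is not the project's test suite: `find_violations` is a hypothetical helper, the field classification is taken from the PR description, and the system fields are skipped because their types are not stated here. It also assumes the `oneOf` has only a string branch and a string-array branch, so a null value is reported; whether the API ever returns null for these fields is exactly what the test plan would need to establish.

```python
from typing import Any, Dict, List

# Non-HTML fields listed in the PR; their types are not checked here.
SYSTEM_FIELDS = {
    "url", "sourceURL", "statusCode", "error",
    "cacheState", "cachedAt", "numPages",
}

# Field the PR keeps as string[] only.
ARRAY_ONLY_FIELDS = {"ogLocaleAlternate"}


def is_string_list(value: Any) -> bool:
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def find_violations(metadata: Dict[str, Any]) -> List[str]:
    """Return one message per field whose value has an unexpected type."""
    problems = []
    for name, value in metadata.items():
        if name in SYSTEM_FIELDS:
            continue
        if name in ARRAY_ONLY_FIELDS:
            ok = is_string_list(value)
        else:
            # Every other field is treated as HTML-extracted.
            ok = isinstance(value, str) or is_string_list(value)
        if not ok:
            problems.append(f"{name}: unexpected {type(value).__name__}")
    return problems


if __name__ == "__main__":
    sample = {
        "title": "Home",
        "keywords": ["a", "b"],
        "ogLocaleAlternate": [],
        "statusCode": 200,
        "description": None,
    }
    print(find_violations(sample))
```

A full validation run against the OpenAPI files would still need a real schema validator; this only catches type mismatches in the metadata object.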

Notes

  • Scope expanded significantly during development from 2 fields to all HTML-extracted fields based on @mogery's feedback
  • Both v1 and v2 schemas updated with 269 total insertions across multiple response schema locations
  • JSON syntax validation passed but real-world API compatibility testing is needed

Session Details:
Requested by @mogery | Session: https://app.devin.ai/sessions/e04da2c10b8c4db3b2f22ad23effb3ac

- Add oneOf construct for keywords field to support both string and string array types
- Define ogLocaleAlternate as explicit array type
- Update all metadata schema locations in both v1 and v2 OpenAPI files
- Maintain backward compatibility while accurately documenting API behavior
- Align with Python SDK (Union[str, List[str]]) and JavaScript SDK (string | string[]) implementations

Co-Authored-By: mogery <[email protected]>
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Member

@mogery mogery left a comment

All fields that are extracted from HTML (so not url, sourceURL, statusCode, error, cacheState, cachedAt, numPages) can be a string or an array

Contributor Author

Thank you for the clarification! You're absolutely right - I need to update ALL HTML-extracted metadata fields to support both string and array types, not just keywords and ogLocaleAlternate.

I'll update the OpenAPI schemas to apply the oneOf construct to all fields extracted from HTML (excluding url, sourceURL, statusCode, error, cacheState, cachedAt, numPages as you specified).

This includes fields like:

  • title
  • description
  • language
  • keywords
  • robots
  • ogTitle
  • ogDescription
  • ogUrl
  • ogImage
  • ogLocaleAlternate (array only)
  • ogSiteName
  • And any other HTML-extracted metadata fields

I'll push the updated changes to this PR shortly.

…r ENG-3620

- Update ALL HTML-extracted metadata fields (title, description, language, keywords, etc.) to use oneOf construct
- Support both string and array types for all HTML-extracted fields except excluded ones
- Exclude non-HTML fields: url, sourceURL, statusCode, error, cacheState, cachedAt, numPages
- Update generic '<any other metadata>' pattern to use oneOf construct
- Address PR feedback from @mogery about expanding scope beyond just keywords and ogLocaleAlternate
- Maintain ogLocaleAlternate as explicit array type
- Apply changes consistently across all metadata schema locations in both v1 and v2 OpenAPI files

Co-Authored-By: mogery <[email protected]>
@mogery mogery merged commit 05a3a25 into main Sep 28, 2025
1 check passed