Skip to content

Conversation

rdimitrov
Copy link
Member

@rdimitrov rdimitrov commented Oct 8, 2025

Motivation and Context

The following PR aims to refactor package types to have dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB).

  • The main benefit is through this we'll also have a single line OCI reference set via the identifier instead of splitting it into multiple properties.
  • Another thing that is changed is the MCPB package type doesn't need an explicit version property too (since this is already provided in the identifier (URL))
  • Adds a new migration that updates all existing OCI references to match this layout.

Notes:

  • Initially I've started by having a separate go struct per each type but turns out it's unnecessarily complicated to have multiple go structs so I'll use one while keeping the separation on the schema layer with proper validation that ensures the right properties are populated for the right package type.

How Has This Been Tested?

Breaking Changes

Yes

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

@rdimitrov rdimitrov force-pushed the refactor-package-types branch from e64a442 to b6bd3cc Compare October 8, 2025 12:09
@beshkenadze
Copy link

Hey!
Just wondering — why NPM specifically? The package could also come from bunx, deno, or even be a standalone binary, so maybe a more general runtime type would fit better.

@rdimitrov
Copy link
Member Author

Hey! Just wondering — why NPM specifically? The package could also come from bunx, deno, or even be a standalone binary, so maybe a more general runtime type would fit better.

@beshkenadze - hey 👋

Someone should probably correct me as I'm not that acquainted with the npm ecosystem, but npm in our context is the registry (the location where the package is published). From my understanding npx, bunx, deno and so forth are runtimes that fetch & run npm packages from that registry (by default).

Also this PR solely aim to refactor what is already there and it doesn't introduce any new package types.

@beshkenadze
Copy link

@rdimitrov thx for the clarification!

Here’s the current implementation:

{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-09-29/server.schema.json",
  "packages": [
    {
      "registryType": "npm",
      "registryBaseUrl": "https://registry.npmjs.org",
      "identifier": "@modelcontextprotocol/server-brave-search",
      "version": "1.0.2",
      "fileSha256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce",
      "runtimeHint": "npx",
      "transport": { "type": "stdio" }
    }
  ]
}

What if the package is hosted in a different registry (e.g., GitHub Packages) or executed directly in a TypeScript runtime such as Bun or Deno?

In that case, would registryType: "npm" still make sense, or should we consider a more general value for non-NPM sources?

Also — where would be the best place to discuss package type definitions more broadly (outside this PR)? Maybe in a separate issue or design doc?

@rdimitrov
Copy link
Member Author

What if the package is hosted in a different registry (e.g., GitHub Packages) or executed directly in a TypeScript runtime such as Bun or Deno?

For this case I would expect registryBaseUrl has npm.pkg.github.com instead of https://registry.npmjs.org, right?

Also — where would be the best place to discuss package type definitions more broadly (outside this PR)? Maybe in a separate issue or design doc?

Yep, I think either Discord or a separate Discussion in GitHub would be best 👍 In any case, thanks for raising this 🙏

@rdimitrov rdimitrov force-pushed the refactor-package-types branch 4 times, most recently from 7f7bb21 to 3da63a0 Compare October 8, 2025 21:41
},
"propertyName": "registryType"
},
"oneOf": [
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this may be limiting - e.g. we need other package types that are not in this list, like "on_device"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, adding other package types is not in the scope of this PR 👍

// - For NPM/PyPI/NuGet: package name or ID
// - For OCI: full image reference (e.g., "ghcr.io/owner/repo:v1.0.0")
// - For MCPB: direct download URL
Identifier string `json:"identifier" minLength:"1"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you preserve formatting so diffs are easier to review and we review only what's actually changed :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, can you elaborate more on this?

Copy link
Member

@domdomegg domdomegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with changing OCI to just having the identifier field and combining everything there, but think we can do that with the current model.

I don't think we should split the package definitions up like this - I worry it makes it more likely that the definitions drift further from the implementation (because Golang does not support unions) and looking at it from a client perspective it looks like more models to handle. I think also it might make extending it with custom package types for subregistries harder? I haven't looked super closely, but think they'd then not be compliant with this spec from what I can see.

@rdimitrov rdimitrov changed the title Draft: Refactor package types to use dedicated structs per package type (NPM, PyPI, NuGet, OCI, MCPB) Draft: Refactor package types to use dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB) Oct 9, 2025
@rdimitrov
Copy link
Member Author

rdimitrov commented Oct 9, 2025

I agree with changing OCI to just having the identifier field and combining everything there, but think we can do that with the current model.

I don't think we should split the package definitions up like this - I worry it makes it more likely that the definitions drift further from the implementation (because Golang does not support unions) and looking at it from a client perspective it looks like more models to handle. I think also it might make extending it with custom package types for subregistries harder? I haven't looked super closely, but think they'd then not be compliant with this spec from what I can see.

@domdomegg - I realised I forgot to update the title which might have confused you as this was my initial take, i.e. I started with an interface and different go structs for each type, but I quickly realised it's not good since it became too complicated.

This approach is different. The Go type remains the same and the main change is on the validation part and on the schema so we can use it for another form of validation). Can you check again in case I've mislead you by thinking it's different go types whereas it's only on the schema (so we can ensure OCI doesn't have versions or npm has a file digest for example).

Edit: Or I was wrong and you did refer to the schema suggesting we should keep it looser and don't imply the separation on its level? Apologies for going back and forth

@rdimitrov rdimitrov requested a review from domdomegg October 9, 2025 16:28
@beshkenadze
Copy link

What if the package is hosted in a different registry (e.g., GitHub Packages) or executed directly in a TypeScript runtime such as Bun or Deno?

For this case I would expect registryBaseUrl has npm.pkg.github.com instead of https://registry.npmjs.org, right?

Also — where would be the best place to discuss package type definitions more broadly (outside this PR)? Maybe in a separate issue or design doc?

Yep, I think either Discord or a separate Discussion in GitHub would be best 👍 In any case, thanks for raising this 🙏

FYI: #645

@rdimitrov
Copy link
Member Author

Note: I've tested the PR against prod data earlier and the migration seemed to work fine -

⏺ Migration Testing Complete ✅

Successfully tested migration 009 with real production data (1080 servers). Here's the validation summary:

Migration Results:

- Total servers: 1,080
- Servers with packages: 599 (updated by migration)
- Package breakdown:
  - NPM: 420 packages
  - PyPI: 138 packages
  - OCI: 75 packages
  - MCPB: 46 packages
  - NuGet: 20 packages

Validation Checks (All Passed):

1. ✅ All packages have transport field: 0 packages missing transport
2. ✅ OCI canonical format: 0 OCI packages with old version or registryBaseUrl fields
3. ✅ MCPB cleanup: 0 MCPB packages with redundant version or registryBaseUrl fields
4. ✅ Forbidden field removal: 0 non-MCPB packages with fileSha256 field
5. ✅ NPM/PyPI/NuGet unchanged: Still have version and registryBaseUrl as expected

Sample Validation:

- OCI packages: Correctly formatted with canonical references like ghcr.io/victoriametrics-community/mcp-victoriametrics:v1.13.1
- MCPB packages: Retained fileSha256, removed redundant fields
- NPM packages: Unchanged with version and registryBaseUrl preserved

The migration is working correctly on production data!

@rdimitrov rdimitrov changed the title Draft: Refactor package types to use dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB) Refactor package types to use dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB) Oct 9, 2025
Copy link
Member

@tadasant tadasant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding my comments in Discord here --

Ok my view is that:

  1. I agree with @domdomegg's point that we don't want an explosion of package type structures in the schemas
  2. Though I'm not quite comfortable going to the other extreme of solely keeping this at the validation layer without any explicit in the OpenAPI/server.json schemas that this is the case

If we were to do (2), I think it'd look something like this --

  Package:
    type: object
    required:
      - registryType
      - identifier
      - transport
      # version is NO LONGER globally required
    properties:
      registryType:
        type: string
        description: |
          Registry type indicating how to download packages.
          - npm, pypi, nuget: Use identifier + version + registryBaseUrl
          - oci: Use identifier as full canonical reference (registry/namespace/image:tag)
          - mcpb: Use identifier as download URL
        enum: [npm, pypi, nuget, oci, mcpb]

      identifier:
        type: string
        description: |
          Package identifier. Format varies by registryType:
          - npm/pypi/nuget: Package name (e.g., "@scope/package", "package-name")
          - oci: Full canonical reference (e.g., "docker.io/user/image:v1.0.0", "ghcr.io/org/app:latest")
          - mcpb: Direct HTTPS download URL

      version:
        type: string
        description: |
          Package version. Required for npm, pypi, nuget.
          Ignored for oci (version in identifier) and mcpb (version in URL).

      registryBaseUrl:
        type: string
        description: |
          Registry base URL. Optional, with defaults:
          - npm: defaults to https://registry.npmjs.org
          - pypi: defaults to https://pypi.org
          - nuget: defaults to https://api.nuget.org/v3/index.json
          - oci: not used (registry in identifier)
          - mcpb: not used

And I think that's not great because it leans so heavily on free form documentation for understanding expected shapes.

A third option, which is what I think I like - introduce "flavors" of types. I think we have three --

# Base package definition
  PackageBase:
    type: object
    required: [registryType, identifier, transport]
    properties:
      registryType:
        type: string
      identifier:
        type: string
      transport:
        # ... transport def
      # ... common fields

  # Registry-based packages (npm, pypi, nuget)
  RegistryPackage:
    allOf:
      - $ref: '#/components/schemas/PackageBase'
      - type: object
        required: [version]
        properties:
          registryType:
            enum: [npm, pypi, nuget]
          version:
            type: string
          registryBaseUrl:
            type: string
            format: uri

  # Image-based packages (OCI)
  ImagePackage:
    allOf:
      - $ref: '#/components/schemas/PackageBase'
      - type: object
        properties:
          registryType:
            const: oci
          identifier:
            description: "Canonical OCI reference"
            example: "docker.io/user/image:v1.0.0"

  # Binary packages (MCPB)
  BinaryPackage:
    allOf:
      - $ref: '#/components/schemas/PackageBase'
      - type: object
        required: [fileSha256]
        properties:
          registryType:
            const: mcpb
          fileSha256:
            type: string

  # Main Package type
  Package:
    oneOf:
      - $ref: '#/components/schemas/RegistryPackage'
      - $ref: '#/components/schemas/ImagePackage'
      - $ref: '#/components/schemas/BinaryPackage'
    discriminator:
      propertyName: registryType
      mapping:
        npm: '#/components/schemas/RegistryPackage'
        pypi: '#/components/schemas/RegistryPackage'
        nuget: '#/components/schemas/RegistryPackage'
        oci: '#/components/schemas/ImagePackage'
        mcpb: '#/components/schemas/BinaryPackage'

Maybe we could keep the const values looser / as suggestions so that anyone using the spec could use whatever registryTypes they want, we're just kicking it off with these 5 suggestions

This still gives structure while avoiding a per-registry-type explosion. What do you think?

@rdimitrov rdimitrov force-pushed the refactor-package-types branch from e60bc80 to b6a47f7 Compare October 10, 2025 16:07
case model.RegistryTypeNPM:
registryType = model.RegistryTypeNPM
registryBaseURL = model.RegistryURLNPM
pkg = model.Package{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[can be a follow up PR if you agree] I feel like we should use the "flavors" of Package types internally, and just map each of them to a "list of supported registry types" that can be expanded over time as support is added for more registry types. Instead of bespoke recreating each of the flavors on a per-type basis from Package in this switch statement.

items:
$ref: '#/components/schemas/KeyValueInput'

RegistryPackage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RegistryPackage as a name feels a little overloaded. Maybe LanguagePackage?

properties:
registryType:
type: string
const: mcpb
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop this const in favor of an example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for the other consts on the other Package types

return ErrMissingVersionForPyPi
}

// Validate that MCPB-specific fields are not present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably include a unit test on this logic branch (for npm, nuget as well). Unit test could be generically "does not allow fields that aren't explicitly in the LanguagePackage object"

Signed-off-by: Radoslav Dimitrov <[email protected]>
Copy link
Member

@tadasant tadasant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good with landing this to unblock OCI identifiers

@rdimitrov rdimitrov dismissed domdomegg’s stale review October 10, 2025 21:33

Dismissing this so we can address the OpenAPI changes separately (there's a discussion in Discord about the right approach)


// Validate that MCPB-specific fields are not present
if pkg.FileSHA256 != "" {
return fmt.Errorf("NPM packages must not have 'fileSha256' field - this is only for MCPB packages")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, but can probably drop this is only for MCPB packages, in case we support other packages with this in future. I think error is clear enough without it.

// - registry/namespace/image:tag@digest
// - namespace/image:tag (defaults to docker.io)
// - image:tag (defaults to docker.io/library)
func ParseOCIReference(ref string) (*OCIReference, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little scared about implementing this ourselves, but see there's lots of test coverage. I think I'd prefer to find a battle tested narrow library for this if there is one, but no biggie if not.

Copy link
Member

@domdomegg domdomegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am happy with this, thank you! My comments can be done as follow-ups / are just informational.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants