-
Notifications
You must be signed in to change notification settings - Fork 423
Refactor package types to use dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB) #634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
e64a442
to
b6bd3cc
Compare
Hey! |
@beshkenadze - hey 👋 Someone should probably correct me as I'm not that acquainted with the npm ecosystem, but Also this PR solely aim to refactor what is already there and it doesn't introduce any new package types. |
@rdimitrov thx for the clarification! Here’s the current implementation: {
"$schema": "https://static.modelcontextprotocol.io/schemas/2025-09-29/server.schema.json",
"packages": [
{
"registryType": "npm",
"registryBaseUrl": "https://registry.npmjs.org",
"identifier": "@modelcontextprotocol/server-brave-search",
"version": "1.0.2",
"fileSha256": "fe333e598595000ae021bd27117db32ec69af6987f507ba7a63c90638ff633ce",
"runtimeHint": "npx",
"transport": { "type": "stdio" }
}
]
} What if the package is hosted in a different registry (e.g., GitHub Packages) or executed directly in a TypeScript runtime such as Bun or Deno? In that case, would registryType: "npm" still make sense, or should we consider a more general value for non-NPM sources? Also — where would be the best place to discuss package type definitions more broadly (outside this PR)? Maybe in a separate issue or design doc? |
For this case I would expect
Yep, I think either Discord or a separate Discussion in GitHub would be best 👍 In any case, thanks for raising this 🙏 |
7f7bb21
to
3da63a0
Compare
}, | ||
"propertyName": "registryType" | ||
}, | ||
"oneOf": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this may be limiting - e.g. we need other package types that are not in this list, like "on_device"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, adding other package types is not in the scope of this PR 👍
// - For NPM/PyPI/NuGet: package name or ID | ||
// - For OCI: full image reference (e.g., "ghcr.io/owner/repo:v1.0.0") | ||
// - For MCPB: direct download URL | ||
Identifier string `json:"identifier" minLength:"1"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you preserve formatting so diffs are easier to review and we review only what's actually changed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, can you elaborate more on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with changing OCI to just having the identifier field and combining everything there, but think we can do that with the current model.
I don't think we should split the package definitions up like this - I worry it makes it more likely that the definitions drift further from the implementation (because Golang does not support unions) and looking at it from a client perspective it looks like more models to handle. I think also it might make extending it with custom package types for subregistries harder? I haven't looked super closely, but think they'd then not be compliant with this spec from what I can see.
@domdomegg - I realised I forgot to update the title which might have confused you as this was my initial take, i.e. I started with an interface and different go structs for each type, but I quickly realised it's not good since it became too complicated. This approach is different. The Go type remains the same and the main change is on the validation part and on the schema so we can use it for another form of validation). Can you check again in case I've mislead you by thinking it's different go types whereas it's only on the schema (so we can ensure OCI doesn't have versions or npm has a file digest for example). Edit: Or I was wrong and you did refer to the schema suggesting we should keep it looser and don't imply the separation on its level? Apologies for going back and forth |
FYI: #645 |
Note: I've tested the PR against prod data earlier and the migration seemed to work fine -
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding my comments in Discord here --
Ok my view is that:
- I agree with @domdomegg's point that we don't want an explosion of package type structures in the schemas
- Though I'm not quite comfortable going to the other extreme of solely keeping this at the validation layer without any explicit in the OpenAPI/server.json schemas that this is the case
If we were to do (2), I think it'd look something like this --
Package:
type: object
required:
- registryType
- identifier
- transport
# version is NO LONGER globally required
properties:
registryType:
type: string
description: |
Registry type indicating how to download packages.
- npm, pypi, nuget: Use identifier + version + registryBaseUrl
- oci: Use identifier as full canonical reference (registry/namespace/image:tag)
- mcpb: Use identifier as download URL
enum: [npm, pypi, nuget, oci, mcpb]
identifier:
type: string
description: |
Package identifier. Format varies by registryType:
- npm/pypi/nuget: Package name (e.g., "@scope/package", "package-name")
- oci: Full canonical reference (e.g., "docker.io/user/image:v1.0.0", "ghcr.io/org/app:latest")
- mcpb: Direct HTTPS download URL
version:
type: string
description: |
Package version. Required for npm, pypi, nuget.
Ignored for oci (version in identifier) and mcpb (version in URL).
registryBaseUrl:
type: string
description: |
Registry base URL. Optional, with defaults:
- npm: defaults to https://registry.npmjs.org
- pypi: defaults to https://pypi.org
- nuget: defaults to https://api.nuget.org/v3/index.json
- oci: not used (registry in identifier)
- mcpb: not used
And I think that's not great because it leans so heavily on free form documentation for understanding expected shapes.
A third option, which is what I think I like - introduce "flavors" of types. I think we have three --
# Base package definition
PackageBase:
type: object
required: [registryType, identifier, transport]
properties:
registryType:
type: string
identifier:
type: string
transport:
# ... transport def
# ... common fields
# Registry-based packages (npm, pypi, nuget)
RegistryPackage:
allOf:
- $ref: '#/components/schemas/PackageBase'
- type: object
required: [version]
properties:
registryType:
enum: [npm, pypi, nuget]
version:
type: string
registryBaseUrl:
type: string
format: uri
# Image-based packages (OCI)
ImagePackage:
allOf:
- $ref: '#/components/schemas/PackageBase'
- type: object
properties:
registryType:
const: oci
identifier:
description: "Canonical OCI reference"
example: "docker.io/user/image:v1.0.0"
# Binary packages (MCPB)
BinaryPackage:
allOf:
- $ref: '#/components/schemas/PackageBase'
- type: object
required: [fileSha256]
properties:
registryType:
const: mcpb
fileSha256:
type: string
# Main Package type
Package:
oneOf:
- $ref: '#/components/schemas/RegistryPackage'
- $ref: '#/components/schemas/ImagePackage'
- $ref: '#/components/schemas/BinaryPackage'
discriminator:
propertyName: registryType
mapping:
npm: '#/components/schemas/RegistryPackage'
pypi: '#/components/schemas/RegistryPackage'
nuget: '#/components/schemas/RegistryPackage'
oci: '#/components/schemas/ImagePackage'
mcpb: '#/components/schemas/BinaryPackage'
Maybe we could keep the const
values looser / as suggestions so that anyone using the spec could use whatever registryTypes they want, we're just kicking it off with these 5 suggestions
This still gives structure while avoiding a per-registry-type explosion. What do you think?
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
…t pinning Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
e60bc80
to
b6a47f7
Compare
Signed-off-by: Radoslav Dimitrov <[email protected]>
Signed-off-by: Radoslav Dimitrov <[email protected]>
case model.RegistryTypeNPM: | ||
registryType = model.RegistryTypeNPM | ||
registryBaseURL = model.RegistryURLNPM | ||
pkg = model.Package{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[can be a follow up PR if you agree] I feel like we should use the "flavors" of Package types internally, and just map each of them to a "list of supported registry types" that can be expanded over time as support is added for more registry types. Instead of bespoke recreating each of the flavors on a per-type basis from Package in this switch statement.
docs/reference/api/openapi.yaml
Outdated
items: | ||
$ref: '#/components/schemas/KeyValueInput' | ||
|
||
RegistryPackage: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RegistryPackage
as a name feels a little overloaded. Maybe LanguagePackage
?
docs/reference/api/openapi.yaml
Outdated
properties: | ||
registryType: | ||
type: string | ||
const: mcpb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop this const in favor of an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the other consts on the other Package types
return ErrMissingVersionForPyPi | ||
} | ||
|
||
// Validate that MCPB-specific fields are not present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably include a unit test on this logic branch (for npm, nuget as well). Unit test could be generically "does not allow fields that aren't explicitly in the LanguagePackage object"
Signed-off-by: Radoslav Dimitrov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good with landing this to unblock OCI identifiers
Dismissing this so we can address the OpenAPI changes separately (there's a discussion in Discord about the right approach)
|
||
// Validate that MCPB-specific fields are not present | ||
if pkg.FileSHA256 != "" { | ||
return fmt.Errorf("NPM packages must not have 'fileSha256' field - this is only for MCPB packages") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit, but can probably drop this is only for MCPB packages
, in case we support other packages with this in future. I think error is clear enough without it.
// - registry/namespace/image:tag@digest | ||
// - namespace/image:tag (defaults to docker.io) | ||
// - image:tag (defaults to docker.io/library) | ||
func ParseOCIReference(ref string) (*OCIReference, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little scared about implementing this ourselves, but see there's lots of test coverage. I think I'd prefer to find a battle tested narrow library for this if there is one, but no biggie if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am happy with this, thank you! My comments can be done as follow-ups / are just informational.
Motivation and Context
The following PR aims to refactor package types to have dedicated schema types per package type (NPM, PyPI, NuGet, OCI, MCPB).
Notes:
How Has This Been Tested?
Breaking Changes
Yes
Types of changes
Checklist
Additional context