-
Notifications
You must be signed in to change notification settings - Fork 690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support different levels validation for JSON reader stream #452
Conversation
schema/validator.go
Outdated
if strict { | ||
return fmt.Errorf("error: manifest has an unknown media type: %s\n", header.MediaType) | ||
} | ||
fmt.Printf("warning: manifest has an unknown media type: %s\n", header.MediaType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #403 I avoided repeating this string by setting up the error (once) and then either warning about it or returning it depending on strict
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in
6934d41
3da0cd8
a2d0b36
I code it as,
msg := fmt.Sprintf(...)
if strict {
return fmt.Errorf("error: %s", msg)
}
fmt.Printf("warning: %s\n", msg)
I honestly like fmt.Sprintf
to generate message, because it may be error or warning type.
fmt.Errorf
and fmt.Printf
can distinguish returning or omitting behavior.
schema/validator.go
Outdated
if header.MediaType != string(v1.MediaTypeImageManifestList) { | ||
msg := fmt.Sprintf("manifest list has an unknown media type: %s", header.MediaType) | ||
if strict { | ||
return fmt.Errorf("error: %s", msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the “warning: ” prefix you have below, but I don't think you need an “error: ” prefix for the error string. You know it's an error because its Go type is error
, having “error” in the string payload is redundant. If you want to display these to users with an error prefix (which is fine with me), then prefix the error with “error: ” when you print it to stderr (or wherever). But don't add the prefix here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
schema/validator.go
Outdated
if strict { | ||
return fmt.Errorf("error: manifest %s has an unknown media type: %s\n", manifest.Digest, manifest.MediaType) | ||
} | ||
fmt.Printf("warning: manifest %s has an unknown media type: %s\n", manifest.Digest, manifest.MediaType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do the same Sprintf
(or whatever) unification here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is a nit, fix in 25825f3#diff-4a672c41642ea30ebf88bb87685cf72aR119
schema/manifestlist_test.go
Outdated
@@ -0,0 +1,106 @@ | |||
// Copyright 2016 The Linux Foundation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it will conflict with your in-flight #444 (also adding a schema/manifestlist_test.go
from scratch).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I would resolve the conflict after any PR landed firstly.
The test cases in these PRs are for different purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked into #444 again, but I think manifestlist_test.go
needs a success test with OCI-defined mediaType
s for both the root mediaType
and the manifests[0].mediaType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall OK but needs docs and removal of random printing.
schema/validator.go
Outdated
@@ -46,17 +47,17 @@ func (e ValidationError) Error() string { | |||
} | |||
|
|||
// Validate validates the given reader against the schema of the wrapped media type. | |||
func (v Validator) Validate(src io.Reader) error { | |||
func (v Validator) Validate(src io.Reader, strict bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please describe what strict does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I add doc in d9b5499#diff-4a672c41642ea30ebf88bb87685cf72aR49. PTAL.
schema/validator.go
Outdated
if strict { | ||
return fmt.Errorf("%s", msg) | ||
} | ||
fmt.Printf("warning: %s\n", msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
erm, I don't think this package should just randomly printf stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philips , here I keep previous non-strict behavior(warning) unchanged. And I think it seems had better to warn customer his media-type is not OCI format, even be accepted.
Or do you have some proposal for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should stick to printing warnings for this PR.
In follow-up work we can add a validation argument which can hold a test-harness handle (e.g. a more generic form of the harness *tap.T
I mocked up in opencontainers/image-tools@4f0b3c1b). Then these test functions can register errors and warnings without worrying about how the results are being collected or displayed. But we don't need to bite all of that off in one chunk ;).
schema/validator.go
Outdated
@@ -46,17 +47,18 @@ func (e ValidationError) Error() string { | |||
} | |||
|
|||
// Validate validates the given reader against the schema of the wrapped media type. | |||
func (v Validator) Validate(src io.Reader) error { | |||
// strict: accept the customized media type when value is false, or reject when true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think better docs for strict
are “require the validated object to only contain content which the OCI defines (or references) in the spec”. More notes on assorted shades of validation strictness in opencontainers/image-tools#66.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking I think your words may be good as help message for customer, but I really want to present the design purpose here, as my words.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What design purpose are my words missing? This comment is going to be in user-facing godocs, so I think we need to cover both audiences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
schema/manifestlist_test.go
Outdated
manifestList: ` | ||
{ | ||
"schemaVersion": 2, | ||
"mediaType": "application/customized+json", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
schema/manifestlist_test.go
Outdated
manifestList: ` | ||
{ | ||
"schemaVersion": 2, | ||
"mediaType": "application/customized+json", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail regardless of strict (since #404). I'm fine leaving a test for that in place (with fail: true
), but the test should use strict: false
to show that the failure is due to you not passing a manifest-list, and not due to particular validation strictness.
@opencontainers/image-spec-maintainers @wking , |
schema/manifest_test.go
Outdated
"size": 1470, | ||
"digest": "sha256:c86f7763873b6c0aae22d963bab59b4f5debbed6685761b5951584f6efb0633b" | ||
}, | ||
"layers": [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently illegal (although possibly not validated). See #407, which is trying to make it legal. For the purpose of this test, you should use the usual strictly-valid dummy layer from the other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking , but current schema really allows empty layer slice(I've debug it). I think spec also allow empty. I honest dislike to avoid unclear of undetermined stuff in specific patch.
To add more test case for optional layers is another story(IMO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Thu, Nov 17, 2016 at 06:00:35PM -0800, xiekeyang wrote:
… but current schema really allows empty layer slice (I've debug it). I think spec also allow empty…
I don't think the spec allows empty layers, based on:
The array MUST have the base image at index 0.
#407, and #318 before it, are attempting to lift that restriction. But I agree that you don't want to get involved in that discussion in this PR. By using the usual single-entry layers
array here, you would avoid getting involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking You are right. Excuse me to misunderstood the previous discussion. I fix it in a183979#diff-6493def706b32045f8d30e7a4b860c46R158.
What you mentioned seems a bug, which I would fix following up.
schema/spec_test.go
Outdated
@@ -73,7 +73,7 @@ func validate(t *testing.T, name string) { | |||
continue | |||
} | |||
|
|||
err = schema.Validator(example.Mediatype).Validate(strings.NewReader(example.Body)) | |||
err = schema.Validator(example.Mediatype).Validate(strings.NewReader(example.Body), false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we should be able to validate spec examples strictly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. This test is for spec validation, which MUST be strict OCI format coincidence. false
will lead this test case loss effectiveness. fix in b68d692#diff-5b73117945d63fe6253dccb9c210d74fR76.
The diff from both b68d692 and a183979 looks fine to me, although I
think we might want to squash the commits together. As it stands
running the test suite on just b68d692 raises:
$ make test
go test -race -cover github.com/opencontainers/image-spec/schema github.com/opencontainers/image-spec/specs-go github.com/opencontainers/image-spec/specs-go/v1
# github.com/opencontainers/image-spec/schema_test
schema/manifest_test.go:118: not enough arguments in call to schema.MediaTypeManifest.Validate
FAIL github.com/opencontainers/image-spec/schema [build failed]
…
|
squashed in 833391c |
833391c looks good to me :).
|
How can we define a validation framework that can support different levels of validation? This PR sub-divides the validation space into media types and not media types. This creates the problem that we now have to define what belongs in strict and what doesn't. Wouldn't it make sense if we have options to control the suites of validation? For example, to get the "strict" behavior, one might configure this as: Validate(r, ValidateFields, ValidateMediaTypes) We could also have explicit sets of validation that meet certain support-levels: var ValidateV1Strict []Validator{ValidateFields, ValidateMediaTypes}
Validate(r, ValidateV2Strict...) Without a definition of what "strict" means, we'll be in the position that this will be under definition of the implementation, where if we are clear what is being validated, we can assure users they are in line with the specification. |
On Mon, Nov 21, 2016 at 01:24:03PM -0800, Stephen Day wrote:
There's a definition here 1. I think we can tighten that defintion
I'm fine with this sort of breakdown, but if we want to support it I
And it seems orthogonal to how strictly we do each of those |
@wking Please let the submitter respond to code review comments. Again, you linked me to something without any context and I have no clue what you mean. |
@stevvooe I mostly understand and agree your expectation. Let me refactor the PR and re-submit sooner. |
@stevvooe
Customized json is failed under strict mode, and prompt warning non-strict mode. I think your proposal want tool to be more flexible, like,
And you want to involve all possible objects(in future) to However, I still have a question. You said,
Actually I think |
I refactor patch according to @stevvooe proposal of level usage and function interface. It is done in https://github.com/xiekeyang/image-spec/blob/6b47a182a3b6499df2bead20016d6e4eec7fdb4e/schema/validator.go. |
|
||
for _, manifest := range header.Manifests { | ||
if manifest.MediaType != string(v1.MediaTypeImageManifest) { | ||
return fmt.Errorf("manifest %s has an unknown media type: %s", manifest.Digest, manifest.MediaType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach loses the strict distinction you had before. I don't see a way to invoke this to error out when a media type is truly invalid (i.e. not compliant with RFC 6838, #448) vs. not necessarily supported by an OCI-compliant implementation (e.g. a non-OCI type). I think we need to restore the strict option you had before so we can make distinctions like that (more on strictness in opencontainers/image-tools#66).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking
After former proposal from @stevvooe, I think we have made agreement. Current commit follows up the proposal thought, to use function slice. Consumer can select objects(fields, media type...) at will.
I don't see a way to invoke this to error out when a media type is truly invalid (i.e. not compliant with RFC 6838, #448) vs. not necessarily supported by an OCI-compliant implementation (e.g. a non-OCI type).
Under truly invalid case, JSON schema prompts error.
Under non-OCI type, validator will prompt error only if ref-type(here is media type) function is selected. Or it is validated successfully.
Honestly I'm not very willing roll back the former strict framework, which has been rejected. And we had better to wait @stevvooe's feedback.
cc @stevvooe WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under truly invalid case, JSON schema prompts error.
This is not always true. For example, the hostname / UTS-namespace requirement (a case of the general namespace no-tweaking requirement) is not covered by the JSON Schema.
Honestly I'm not very willing roll back the former strict framework, which has been rejected. And we had better to wait @stevvooe's feedback.
Yeah, better to not change anything until we know what the maintainers are looking for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And to give a JSON Schema limitation for this spec, we don't currently check to see that descriptors don't use the reserved data
. And I'm not sure how you'd phrase that in JSON Schema since we have to allow other unrecognized properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And to give a JSON Schema limitation for this spec, we don't currently check to see that descriptors don't use the reserved data. And I'm not sure how you'd phrase that in JSON Schema since we have to allow other unrecognized properties.
It didn't, and I don't think it belongs to this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it belongs to this patch.
I agree that we don't want to cover the reserved data
as part of this PR. This PR is about two things:
- Broadly, setting a precedent for handling verification for behaviour which images MAY use but which implementations MUST support.
- Narrowly, adding optional validation for per-location media types, which are a special case of the broader issue.
So while I have no desire to expand the narrow target, I want to try and solve it in a way that will cover the broader issue. I think the strict
option sets a precedent which will scale well as we extend it to other spec statements.
I'm less clear on how the “pass in an array of validator functions” will scale. It seems like it might end up with a separate validator for each spec statement? Or maybe we'll end up with a ValidateOnlyContainsOCIRequiredSettings
validator? Or…? It's not clear to me how we're expected to apply this approach to future spec statements.
@stevvooe If current patch looks as what you want? |
This SGTM. Though @stevvooe recommends this needs a technical proposal on how this kind of feature ties to the requirements of compliance and certification |
@vbatts We can go forward with this approach until we have better requirements. |
@stevvooe I think you agree the current implementation of pr, to present selectable media validating function, which is controlled by function-map. In future we can add more val function for another objects if necessary. Is that right? |
@stevvooe I'm considering to use interface for |
Is this needing to block a vote for RC5? |
Or we can consider to move this milestone to next release. |
@opencontainers/image-spec-maintainers I update this, to keep functional map, and add a bug fix commit. PTAL. |
image-layout.md
Outdated
@@ -134,7 +134,7 @@ The `imageLayoutVersion` value will align with the OCI Image Specification versi | |||
|
|||
### oci-layout Example | |||
|
|||
```json | |||
```json,title=Image%20Layout&valid=enabling&mediatype=oci%2Dlayout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of adding a title, but “Image Layout” is the whole structure, not just the oci-layout
file. Can we follow the section headers and call this “oci-layout file” or some such?
What is valid=enabling
about?
I don't think we want to invent a one-off media type here. If we want a media type for this file format (e.g. for example validation), I think we should declare a media type for the schema like we do with our other schemas (e.g. here and here for manifests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, valid=enabling
is a redundant debugging item, should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I extract this and put it on #579.
image-layout.md
Outdated
@@ -162,7 +162,7 @@ Those tags will often be represented in an image-layout repository with matching | |||
|
|||
### Index Example | |||
|
|||
```json,title=Image%20Index&mediatype=application/vnd.oci.image.index.v1%2Bjson | |||
```json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still want to validate this example as an index even if it doesn't occur in image-index.md
. And titles seem like they'd always be a good idea. Why are you removing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove this title as not to valid it. I understand that you want to valid it.
But I think spec_test.go
should only valid those who are announced only in itself *.md. So I remove it, for it is just a reference. It is IMO, to be not sure.
schema/validator.go
Outdated
@@ -26,14 +26,42 @@ import ( | |||
"github.com/xeipuuv/gojsonschema" | |||
) | |||
|
|||
type validateRefMediaFunc func(r io.Reader) error | |||
|
|||
var vlidateRefMediaMap = map[Validator]validateRefMediaFunc{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vlidateRefMediaMap
→ validateRefMediaMap
.
} | ||
|
||
// ValidateFunc provides the type of function to validate specific object. | ||
type ValidateFunc func(r io.Reader, v Validator) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we expect to be applying a series of these to validate a single blob, using an io.Reader
seems wasteful. Is there some Go type magic we can use to pass a parsed JSON object through to the validator to let us only deserialize the JSON once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find out the better way yet. At the same time, I still think it had better to input io reader stream to. ValidateRefFuncs are public functions. Consumer should be allowed called them directly. Here we provide io reader to customer, who can use them freely, and it allows expend their job in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ValidateRefFuncs are public functions.
That doesn't really matter. We could always provide io.Reader
and parsed-JSON flavored versions if we figure out how to write a function that takes an already-parsed-from-JSON object.
schema/validator.go
Outdated
// Validator wraps a media type string identifier | ||
// and implements validation against a JSON schema. | ||
type Validator string | ||
|
||
type validateDescendantsFunc func(r io.Reader) error | ||
// Validate validates the given reader against the schema and OCI format defined in spec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anything about “the schema” or “OCI format defined in the spec” in your current implementation (3ad5d1b). This function is just “read r
into memory and call all of funcList
on the content, dying on the first error”.
schema/validator.go
Outdated
buf, err := ioutil.ReadAll(r) | ||
if err != nil { | ||
return errors.Wrap(err, "unable to read the document file") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to bother with this, since you're only calling a single f
. Just pass your reader along:
if ok {
if f == nil {
…
}
return f(r)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should be fixed.
schema/validator.go
Outdated
return f(bytes.NewReader(buf)) | ||
} | ||
|
||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should be raising the same unimplemented error here as you currently have for f == nil
. How about:
f, ok := vlidateRefMediaMap[v]
if ok && f != nil {
return f(r)
}
return fmt.Errorf("referenced media type validation %q unimplemented", v)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hard to agree for this. Some validations such as descriptor, layout doesn't exist in function map, which make no sense. They should be ignore, but not prompt error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some validations such as descriptor, layout doesn't exist in function map, which make no sense.
We can validate descriptors against the JSON Schema. And if you want JSON Schema validation handled elsewhere, we can define a no-op validator for types where we know there are no children. But I think silently ignoring validating for all types that have no validator registered is setting ourselves up for “oops, that wasn't valid, but we forgot to add a validator for $TYPE”.
This establishes function list, to allow consumer to select items for validation. A based validation level is JSON schema, and a optional item is the media-type of their referenced fields. Signed-off-by: xiekeyang <[email protected]>
@stevvooe The stuff under discussion is that the argument variable for ValidRefFuncs should be io.Reader type or json type. I prefer to reader. WDYT? |
I'm so regretful that I've make big effort on this, but it can't attract maintainers attention. Although I still think it really bring benefit according to pre discussion, I've to close it. |
As previous discussion among developers, to use selectable strict mode, for validating media type of referenced objects. After enabling strict media types unmatched to OCI will be expected failure.
Signed-off-by: xiekeyang [email protected]