-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: property is passed as integer and cannot be accessed #784
base: master
Are you sure you want to change the base?
Conversation
Although your help is appreciated, I’m reluctant to merge these changes without a proper explanation. Some reproduction steps at minimum and the version(s) affected would be a great first step. |
As of PHP 7.x, if you So in my honest opinion it is absolutely safe to cast them to string for the sake of See https://www.php.net/manual/de/function.json-decode.php#126787 |
That makes sense. I hope to find some time to add a test with the simplified object+json schema to reproduce the case. Last version, including dependencies are used. I've discovered it today after bumping v6.0.0.0 -> v6.1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, the error that the library is reporting is correct. There is no such thing as a JSON property for which the name is not a string; that type is a requirement of the JSON spec. And JSON also does not support associative arrays, or arrays with holes.
Why would we modify this library to suppress that error, when it's correctly pointing out a problem with your input data? IMO, this is a user error, and the library behaviour is already correct.
The only exception would be if the cast was done proactively, as part of the type coercion feature. If you really do want to have the library cast this, then it should be implemented behind that flag as part of a more general assoc array -> object transformation.
I managed to add an example in a test for my simplified case. The test passes with the fix and does not without. |
return [ | ||
[ | ||
'{"some_property":{"some_sub_property":{"4":10,"2":10}}}', | ||
'{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test seems needlessly complicated - what is it that you are actually intending to test? Tests should test one thing, clearly. If there is a need to test multiple things, there should be a clear test for each thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example from my case. As I mentioned before, unfortunately, I can not look into that more closely and trace the root cause at the moment.
Hope to come back to this later or maybe someone else jumps into the same problem and finds time to dig it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands currently, it's not an appropriate test. Tests need to have a clear purpose; they can't just be a paste of "this is a piece of my project that I think should pass validation".
I can not look into that more closely and trace the root cause at the moment.
If this relates to the error reported in your original comment, the root cause is as the error says: you are attempting to use an integer as a JSON property name, which is in violation of the spec. You can't do that.
JSON has no concept of arrays that use arbitrary keys (i.e. associative arrays). You cannot legitimately use an array with a non-zero basis, or with index holes, because that is by definition an associative array. If you want to validate it, you'll need to turn it into a JSON-compliant object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it stands currently, it's not an appropriate test. Tests need to have a clear purpose; they can't just be a paste of "this is a piece of my project that I think should pass validation".
That's right.
you are attempting to use an integer as a JSON property name, which is in violation of the spec. You can't do that.
The JSON property is a numeric string. e.g. "4"
or "2"
in '{"some_property":{"some_sub_property":{"4":10,"2":10}}}'
. To my knowledge it is a perfectly valid JSON. Is there a particular spec rule which states that a field cannot be made out of 0-9
characters only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A string containing a number is fine - that's still a string, which is what matters.
My comment about using integers as property names relates to your original post in this PR: using integers as property names is what you said that you are doing to trigger the error. I wasn't referring to the test content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I thought that integer keys of an array caused the issue in the domain model. However the model is normalized into an array before it is passed to JSON validator and the integer keys are already casted into regular strings. So at the end of the day, validator is not able to handle JSON fields which are numeric strings.
This was a dirty workaround, so I made a PR hoping me or someone else could come to this later.
I've tried to simplify the test input, but was unable to do it quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validator is not able to handle JSON fields which are numeric strings.
This seems extremely odd - but would definitely be a bug if it is the case. I will investigate that today.
Is there any particular situation which is required to trigger it, or is it simply any property with a numeric string as the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a more simple example. I'll push it soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that casting into string
is a workaround and it should be traced why string from the input was casted into integer
in the first place 🤔
[ | ||
'{ | ||
"prop1": { | ||
"prop2": "a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test passes without a fix.
[ | ||
'{ | ||
"prop1": { | ||
"123": "a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test does not pass without a fix.
Turns out this is due to php design to silently cast keys which are numeric strings in array to integers.
php/php-src#9029 In this case, maybe the fix/workaround actually makes sense? |
I've been reading along but I must say my initial feel is the same as @erayd If I take the input for the test that fails without the fix and just decode it there is no array. $json = '{"prop1": {"123": "a"}}';
var_dump(json_decode($json));
// object(stdClass)#2 (1) {
// ["prop1"]=>
// object(stdClass)#1 (1) {
// ["123"]=>
// string(1) "a"
// }
//} So it think as mentioned earlier somewhere in code lies an issue where the json string is being decoded into an associative array (which does not exist in other programming languages besides PHP). The true fix should be there where the incorrect decoding takes place. The PR at least proofs there is an issue we weren't aware of so that helps a lot. It is now a matter of debugging where the issue lies. |
Actually what happens is that PHP cannot have array with keys which are numeric strings. They are casted into integers. So:
I think it makes sense to add a support for keys which are integer and being validated against |
Just to summary the issue: Prior to v6.1.0 it was supported to validate array input despite the fact does the input contain array keys as integers or not. v6.1.0 does not support array keys as integers anymore and breaks with the error.
Why is it important to support validating array keys as integers? Because PHP natively casts numeric strings to integers when used in array keys. This is not client/user issue, but how PHP works by design. Therefore it should be supported to validate arrays with integer keys as using numbers as keys in a JSON is a legitimate case. Otherwise, passing input as array to the validator is not reliable and should be deprecated in favour of |
JSON does not support non-string keys for object properties. Such a key is in violation of the spec. As such, it's not something we should support (although throwing an error if we encounter one would make good sense, so that the user knows they need to fix their data).
Thanks for the context - this is useful 🙂
JSON does not support associative arrays. There's no such thing as specifying array keys at all, of any type. JSON arrays are simply a list of values, with no keys at all. A zero-based index is used to address individual items within the list. How can attempting to validate an associative array be anything other than a user error?
Can you clarify, please? Passing objects is what you're already supposed to be doing, if you need to access items using a string key. |
I am not sure why are you referencing non-string keys in a JSON. As I mentioned before and as the test in this PR is, the keys in JSON are strings.
Again, the example is not associative array, but an object where property is a string.
Associated array is being validated because that's what jsonrainbow/json-schema is promoting for using as normalized data input. That's even being tested in lib's tests. ^^ actually after double-checking, I don't see the lib promoting to use |
Because you suggested that we support them (i.e. you suggested that we support int keys). I am telling you that is a spec violation, and we should therefore not support them.
Yep, we established that earlier. If this library is converting object string keys into ints internally somewhere, that is a bug which needs fixing.
Pretty sure we don't promote that anywhere, and the test you've linked looks like one of the "this should fail" tests (albeit I only glanced at it briefly). If you did find some documentation saying that we support it, please point that out, because it will need correcting.
Which of our tests use an associative array as input data? If they're positive tests, and do not contain a suitable pre-validation cast, then that is worthy of further investigation and potentially refactoring. Most of the tests are provided as JSON via the json-schema spec test suite.
Don't pass an associative array to the validator. It's not supported, has never been supported, and is a spec violation anyway. If you managed to do it prior to v6.1.0, that just means that we weren't throwing an error on your invalid data in a situation where we probably should have been. |
To clarify, I don't suggest to support integers in JSON input. I suggesting bringing back support of integers keys in validator when input is array.
I don't think the lib is converting string to ints. Ints are coming from user input. However, user cannot pass strings there as it is not possible by design in PHP.
Sorry, I pasted incorrect link. This one is correct.
json-schema/tests/ValidatorTest.php Line 13 in 7660882
Again, you are talking about 2 different things. JSON (single string) is not passed into a validator, but either stdClass or array is. It's fine that you decide to drop input as array, but this should be highlighted in the docs then. [Edit by @erayd: My apologies; I accidentally edited your post on my phone instead of quote-replying. I have reinstated your post as you originally provided it, and sorry for the confusion!] |
JSON arrays do not have keys. Providing an associative array as input data for validation is therefore a spec violation. It's not a case of "bringing back support", but rather, it's something that should throw an error if you attempt it.
The comment you link seems to be taking about associative arrays. As I have stated several times now, these are a spec violation; you cannot safely use them as input. Avoid them. Provide input that is spec-compliant. If you are referring to a problem with actual, spec-compliant objects, then can you please provide an example of what you mean? Because when I test using string keys on an object, they work precisely as expected: <?php
$a = new StdClass();
$a->{'an explicit string key'} = 'test';
$a->{'5'} = 'five';
var_dump($a);
That one is a type-casting test. It's only applicable when using the type-casting feature; it isn't a normal test mode. I'd need to check the code to be certain (it's been a number of years since I looked closely at that feature), but from memory it will attempt an internal cast-to-object if it sees something that looks like an assoc array.
See my comment above; this is a type-casting test.
This is a test that is explicitly designed to fail. It's a problem if it passes.
This is another type-casting test. As above, using that feature should result in an internal cast occurring.
You're missing the point entirely. The validator expects to see spec-compliant input. That means no assoc arrays, no references, no resource types, etc. The types and structure that you provide it must comply with the JSON spec. An assoc array does not comply, and you should not pass it here without casting. |
From my point of view, assoc array is just data structure to transport JSON. I am pointing out that this data structure isn't fully supported. You are keep mentioning compliance to JSON spec. Isn't these are 2 different things? I would expect the validator to accept This is a simple example which shows that clients should not use
Does it make sense to highlight in the docs that passing an array to the validator isn't expected? |
Of course it's not supported - that data structure is a spec violation. PHP's
Nope. The validator expects that its input should be valid JSON. For performance reasons, it accepts that input in decoded form (i.e.
While it's technically possible to pass any value type, you aren't supposed to do this. If you pass it things that don't comply with the spec, expect to get weird results.
It's not just about how it's iterated. The whole point of a JSON validator is to validate JSON. It isn't designed to validate things that are not JSON. When you provide a value as input to the validator, it expects that value to be valid (albeit pre-decoded) JSON.
That is not a simple example. Can you provide a one-liner please? Arrays (plain ones, zero-indexed with no holes) are fine. Associative arrays (which are essentially dictionaries with a multi-type key) are not.
Yes. I actually thought it already was somewhere, but if we don't mention that, then it absolutely needs adding to the docs. This is not the first time that someone has gotten tangled up by trying to use assoc arrays where they have no business being. More generally, the problem of incorrect types being provided by the user is why the type casting / coercion features exist. They are a quick & dirty way to "just make it go". But they cannot by the very nature of the types involved ever be guaranteed to make the correct assumptions in all cases, so they are best avoided unless you know precisely what the implications will be. It doesn't help that this library has a number of blind spots, and some of those can be triggered by providing data in the wrong type. Most of them have been fixed over the years, but there are still a few lurking around waiting for somebody to trip over 😞. The documentation generally needs a massive birthday. It's not nearly as comprehensive as IMO it should be. |
Thanks, that sheds the light on how the lib should be used 🙇
That makes sense. I guess we should rework our objects serialization before the validator usage in our app. |
@aistis- am I correct in assuming you and @erayd now have a shared understanding about why we dont want to support integer keys and that the problem lies in the library not providing any feedback about being misused and a general lack of documentation? If that is true I would opt to close this PR and open two new issues for the problems summarized above. |
As I understand, passing assoc array isn't supported even it worked in the past, have some tests and available I would probably keep the backwards compatibility and merge the fix to keep upgrades easy. |
Looking more detailed into this in code I can find small slithers of code or docs that refer to associative arrays
In the test I can see test called Trying to look at the PR from the reporter side, @aistis- took the time to report a bug he ran into since the 6.1.0 release. Although the input provided makes for some weird JSON and PHP does type juggling between JSON and associative arrays in both ways I now believe we should accept the PR, with a side note we are doing this to fix the regression introduced in 6.1.0 and we are still not 100% onboard with supporting associative arrays. The next mayor version might be where we drop associative arrays al together, I have opened up a discussion on that topic. @erayd I would like to give you the opportunity to have a final say in this matter before I decide to merge the PR. For future reference: // JSON Object with string property containing only numbers in associative array does type juggling from string to integer
var_dump(json_decode('{"123": "a"}', true));
// array(1) {
// [123]=>
// string(1) "a"
// }
// Associative array declaration with string key having only numbers does type juggling from string to integer
var_dump(['123' => 'a']);
// array(1) {
// [123]=>
// string(1) "a"
// } |
Hi, I have an error coming from this lib when validating a complex object. There is an object which holds a private property with an unsorted array where key is integer. The array does not start with 0, but may be any positive integer.
Unfortunately I haven't had time to investigate the exact reason, but maybe someone will 🤞