Define object name duplicate behavior and require implementations support behavior unique #38

Open
zamicol opened this issue Jul 26, 2022 · 11 comments
@zamicol
zamicol commented Jul 26, 2022

Proposal

Even though JSON RFC 8259 states that "names within an object SHOULD be unique", it leaves duplicate object name behavior undefined. The RFC warns that only objects with unique names are guaranteed interoperability since "all software implementations receiving that object will agree on the name-value mappings", but it does not prohibit duplicate names.

I propose that JSON5 explicitly define object name duplicate behavior while guaranteeing full compatibility with JSON. Explicitly defining behavior increases system interoperability, removes a source of bugs, and gives users fewer surprises.

JSON5 should define the following four object name duplicate behaviors:

  • unique
  • last-value-wins
  • duplicate
  • undefined

unique requires JSON5 implementations to fail on duplicate object names.
last-value-wins requires JSON5 implementations to deduplicate and only report the last name/value pair.
duplicate requires JSON5 implementations to permit duplicates and preserve all name/value pairs.
undefined permits JSON5 implementations to handle duplicate behavior in any way.
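
To make the first three behaviors concrete, here is a minimal sketch that applies each one to the name/value pairs of a single object, in the order a parser would encounter them. The function name `resolveDuplicates` and the pair-array representation are assumptions made for illustration, not part of any JSON5 implementation. The undefined behavior is left out because, by definition, it has no specified result.

```javascript
// Hypothetical helper: applies a duplicate-name behavior to an array of
// [name, value] pairs and returns the pairs the implementation would report.
function resolveDuplicates(pairs, behavior) {
  switch (behavior) {
    case 'unique': {
      // Fail as soon as a name is seen a second time.
      const seen = new Set();
      for (const [name] of pairs) {
        if (seen.has(name)) {
          throw new SyntaxError(`Duplicate object name: ${JSON.stringify(name)}`);
        }
        seen.add(name);
      }
      return pairs.slice();
    }
    case 'last-value-wins': {
      // A Map keeps one entry per name; later values overwrite earlier ones.
      const byName = new Map();
      for (const [name, value] of pairs) {
        byName.set(name, value);
      }
      return [...byName.entries()];
    }
    case 'duplicate':
      // Keep every pair, including repeated names.
      return pairs.slice();
    default:
      throw new RangeError(`Unknown duplicate behavior: ${behavior}`);
  }
}

const pairs = [['a', 1], ['b', 2], ['a', 3]];
console.log(resolveDuplicates(pairs, 'last-value-wins')); // a is 3, b is 2
console.log(resolveDuplicates(pairs, 'duplicate'));       // all three pairs
```

Calling `resolveDuplicates(pairs, 'unique')` on the same input throws a `SyntaxError`, which is the failure that unique asks for.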

Further, JSON5 should

  • Require that all JSON5 implementations MUST support unique.
  • Suggest implementations and applications SHOULD use unique.
  • Suggest implementations and applications SHOULD note duplicate behavior through documentation or other means.
  • Leave the method of behavior selection to implementations. For example, behavior selection may be implemented by a flag, global variable, or be implicit.

This proposal distinguishes between applications and implementations. It suggests that JSON5 implementations MUST define their behaviors and MUST support unique, but applications may behave however they like. The behavior of a particular application may be noted in documentation, conveyed by APIs, or simply not documented at all. This proposal is not suggesting that particular applications or APIs must support unique behavior. Nor is it suggesting a standardized method for application behavior selection.

Related thoughts

Many JSON authority figures have expressed their desire for unique names. It's reasonable for JSON5, which makes minor improvements to JSON, to take this opportunity to implement this hope.

The behavior of Crockford's Java JSON implementation is to error on duplicate object names. Although not JSON5, Crockford's implementation already complies with this proposal since it supports unique.

Also, Crockford suggested modifying the JSON RFC to require unique object names, although it was decided it was too late to do so:

The names within an object SHOULD be unique. If a key is duplicated, a parser SHOULD reject. If it does not reject, it MUST take only the last of the duplicated key pairs.

Disallowing duplicates conforms to the small I-JSON RFC (RFC 7493). The author of I-JSON, Tim Bray, is also the author of JSON RFC 8259.

There are also security and interoperability problems with duplicates. See the article "An Exploration of JSON Interoperability Vulnerabilities".

@jordanbtucker
Member

Thank you for this well thought-out, well-written proposal. Would you mind elaborating on the difference between implementation and application? Those aren't terms already described in the spec, so they may need to be defined if we include them.

On a side note, I just rediscovered that ES5 strict mode forbids duplicate property names.

@zamicol
Author

zamicol commented Aug 8, 2022

Jordan, you've always done fantastic work on JSON5. Thank you for reading this proposal.

Perhaps JSON5 doesn't need to make the distinction? Instead, implementations MUST support unique, and SHOULD support last-value-wins, duplicate, and undefined, and leave it at that.

Looking at ES5, it looks like the default behavior is last-value-wins.

@jordanbtucker
Member

Thanks for the clarification. What is the value in defining an undefined behavior?

You're correct that the default behavior in ES5 is last-value-wins, unless it's in strict mode, then it's unique.
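
As a quick check of the last-value-wins default, the built-in `JSON.parse` can be run on a document with a repeated name. This is only a minimal sketch; the input string and the variable name `parsed` are illustrative.

```javascript
// JSON.parse keeps only the last value for a repeated name.
const parsed = JSON.parse('{"a": 1, "b": 2, "a": 3}');

console.log(parsed.a);                    // 3
console.log(Object.keys(parsed).length);  // 2
```

No error is raised, and the earlier value `1` cannot be recovered from the result.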

@nocturn9x

I don't think having the spec allow for undefined behavior is a great idea. The point of a specification is to lay down a standard, and especially in this case having UB is not only undesirable, but also probably useless.

Just my 2 cents though

@zamicol
Author

zamicol commented Aug 9, 2022

Sorry, my response was poorly written. This is better:

Implementations MUST support duplicate behavior unique. Implementations MAY support additional duplicate behaviors.

Perhaps "non-standard" is a better word here than "undefined". If a JSON5 implementation selects a duplicate behavior that is not unique, last-value-wins, or duplicate, it is said to be non-standard.

An implementation that uses a non-standard behavior, such as duplicate_behavior:random-wins, must be permitted by JSON5 for JSON5 to remain JSON compatible, but JSON5 itself doesn't need to define "non-standard" behaviors.

You're correct that the default behavior in ES5 is last-value-wins, unless it's in strict mode, then it's unique.

What section is that mentioned in?

@zamicol
Author

zamicol commented Aug 15, 2022

Any more thoughts?

I suspect the next step is to draft an example by adding a few sentences about duplicate behavior to the spec section 3 Objects. Having a concrete spec to critique would be helpful.

@jordanbtucker
Member

You're correct that the default behavior in ES5 is last-value-wins, unless it's in strict mode, then it's unique.

What section is that mentioned in?

https://262.ecma-international.org/5.1/#sec-C

It is a SyntaxError if strict mode code contains an ObjectLiteral with more than one definition of any data property (11.1.5).


Implementations MUST support duplicate behavior unique.

JSON5 is facing the same issue TC39 had with Crockford's similar suggestion. There already exist JSON5 documents and JSON5 implementations that behave based on the current spec, which uses SHOULD.

The clause "Implementations MUST support duplicate behavior unique." is a breaking change for JSON5 implementations. Any implementation that was developed against the current spec and did not include unique support, including the reference implementation, would immediately become non-compliant.

@zamicol
Author

zamicol commented Aug 23, 2022

https://262.ecma-international.org/5.1/#sec-C

Ah! Thank you!

become non-compliant

Totally, and that's a great concern.

We've seen a "change" like this before:

The JSON RFC 7159 allowed UTF-8, UTF-16, or UTF-32.

The JSON RFC 8259 requires only UTF-8.

This decision was made after surveying implementations and noting that all popular implementations used UTF-8.

In the same way, we can survey implementations and see if any depend on other behaviors. For example, my guess would be that JavaScript and Go implementations have behavior last-value-wins and Java would be unique.

I'm not sure how much consideration implementers have given to duplicate behavior. It may be that they want behavior unique, but have implemented whatever was the most convenient in their language of choice.

It would be interesting to get a temperature reading of the community's feelings on the issue. What would be the best way to get a response? A survey? Opening a GitHub issue on the various implementations?

Is there a list of JSON5 implementations? That too might be a great place to start.

An alternative to forcing a specific behavior would be for the spec to simply define the behaviors unique, last-value-wins, duplicate, and non-standard, suggest that implementations SHOULD use unique, and leave it at that. Implementations should explicitly document which behavior is the default.


As a different matter, should I open up an issue in the reference implementation to support behavior unique? Alternatively, I can do a pull request just to document it for now.

@jordanbtucker
Member

Is there a list of JSON5 implementations? That too might be a great place to start.

Yes, there is a community-maintained list at In the Wild.

As a different matter, should I open up an issue in the reference implementation to support behavior unique? Alternatively, I can do a pull request just to document it for now.

Yes, that would be a welcome issue / PR. The reference implementation uses last-value-wins because that's what ES5 does, and I used the parsing steps from that spec to write the code. I don't think it's the best behavior, and I'm happy to include a unique option, which is off by default in this version, but perhaps on by default in v3.
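
An opt-in unique check could sit at the step where a parser stores a parsed member on the object it is building. The sketch below is hypothetical: `defineMember` and the `unique` option are names invented for illustration and are not taken from the reference implementation's code or API.

```javascript
// Hypothetical member-assignment step with an opt-in duplicate check.
function defineMember(target, name, value, options = {}) {
  const { unique = false } = options; // off by default, so existing callers keep last-value-wins

  if (unique && Object.prototype.hasOwnProperty.call(target, name)) {
    throw new SyntaxError(`Duplicate object name: ${JSON.stringify(name)}`);
  }

  // defineProperty creates or overwrites an own property, even for
  // names such as "__proto__" that plain assignment treats specially.
  Object.defineProperty(target, name, {
    value,
    writable: true,
    enumerable: true,
    configurable: true,
  });
}

const obj = {};
defineMember(obj, 'bob', 'bob');
defineMember(obj, 'bob', 'bob2'); // default behavior: the later value replaces the earlier one
console.log(obj.bob);             // bob2

try {
  defineMember(obj, 'bob', 'bob3', { unique: true });
} catch (err) {
  console.log(err.name);          // SyntaxError
}
```

Keeping the check behind an option means the default could later be flipped in a major version without changing the call sites.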

@zamicol
Author

zamicol commented Sep 13, 2022

It is a SyntaxError if strict mode code contains an ObjectLiteral with more than one definition of any data property (11.1.5).

Perhaps this changed with ES6? Do you know of an example that will trigger SyntaxError on duplicates?

I suspected something like this would throw SyntaxError. Instead it results in last-value-wins. Looking at the ES6 spec, I can't find a reference to duplicate behavior.

// Declared with const so the snippet also runs in strict mode.
const tc = {'bob':'bob','bob':'bob2'};
console.log(tc); // prints `{bob: 'bob2'}`

@jordanbtucker
Member

jordanbtucker commented Sep 15, 2022

It looks like they removed that restriction in ES6 with the introduction of computed property names. Since the following code would initialize an object with duplicate property names, using last-value-wins, with no way for the compiler to detect the duplicate names, ECMA decided to remove the restriction.

const key1 = 'foo'
const key2 = 'foo'

const object = {
  [key1]: 'bar',
  [key2]: 'baz',
}

console.log(object) // { foo: 'baz' }

See also https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#duplicate_property_names
