-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JSON support #51
JSON support #51
Conversation
cb0ac78
to
dcd06cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try and add more.
I really do love the pervasive use of ?
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent ! Very well documented an implemented.
My comments are mostly about renaming and comments.
include "Deserializer.dfy" | ||
include "ZeroCopy/API.dfy" | ||
|
||
module {:options "-functionSyntax:4"} JSON.API { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's think about names for a moment. Names are the crown of all the hard work you have been putting into this library, and now you have time to step back and think about it ^^.
Everything in library is an API, so using API as a name makes little sense. What about just JSON
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot use just JSON
, unfortunately :( That's because definitions in the JSON module cannot use definitions in submodules of it, like JSON.XYZ
.
I considered having two top-level modules JSON and JSONImpl, but that's not great either.
Finally, this respects the link between module names and file names (JSON.API
in JSON/Api
).
We ought to improve this in the new module system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping JSON.API, it would be possible to create a file in the folder above JSON.dfy that refines this module, so that users don't need to write JSON.API? Could you do this? I don't see myself writing JSON.API to invoke JSON routines. JavaScript offers JSON.stringify and JSON.parse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would introduce a mismatch between the file name and the file path. You can, however, do import JSON.API as JS
, which is good, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is good, but not good enough for what we want to be a default library. I'll take a pass on that later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed this is good enough given the whole repo reserves the right to rename things for now.
import opened BoundedInts | ||
|
||
type View = v: View_ | v.Valid? witness View([], 0, 0) | ||
datatype View_ = View(s: bytes, beg: uint32, end: uint32) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might have preferred "beg" so that it aligns nicely with "end", but aligning is not relevant from the API's interface. "beg" is not a word, and I like that datatype libraries use regular words for identifiers.
I strongly suggest you use "start" here or "beginning"
Could you rename the variable "s" as "bytes" ? If it causes conflict with the type, I would prefer that you rename the type byteSeq so that it's more transparent as a type and bytes
can be used as the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fields (and the constructor) are intended to be internal, so keeping them short and (somewhat) opaque isn't necessarily a bad thing, I think. It's important for encapsulation to not peek at the underlying string or at the bounds.
In contrast, renaming s
to bytes
would be misleading, because there is a public function Bytes
that gives you s[beg..end]
.
That said, I'm OK to rename if you're not convinced.
src/JSON/Utils/Views.dfy
Outdated
else At(0) as opt_byte | ||
} | ||
|
||
method Blit(bs: array<byte>, start: uint32 := 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to rename Blit
to WriteTo
?
also, "bs" => "bytes"
"start" => "offset" (so that you can use "start" instead of "beg" as in my comment above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you; Blit is a bit opaque. But I think it should be CopyTo
, to make it clear that the original object isn't affected.
bs
is the convention throughout, but I agree with you that it's not the best. However bytes
isn't informative either; it just repeats the type. I think the right name would be target
or dest
; I chose dest
based on https://stackoverflow.com/questions/60532503/whats-different-between-words-target-and-dest
start
I would prefer to not change, because other functions have end
(here since it's a public API I didn't use beg
)
src/JSON/ZeroCopy/API.dfy
Outdated
bs := Serializer.Serialize(js); | ||
} | ||
|
||
method SerializeBlit(js: Grammar.JSON, bs: array<byte>) returns (len: SerializationResult<uint32>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blit? What does that mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://en.wikipedia.org/wiki/Bit_blit
It means overwriting (or updating) part of an array from another datastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The link is just for information; I renamed it of course :)
include "DivMod.dfy" | ||
include "Power.dfy" | ||
|
||
module Logarithm { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you perhaps name this module/file Integers
? That way, we would have Integers.Log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't how the rest of this library is structured; I tried to stay consistent with what was already there.
(Also, if you named it Integers
, wouldn't you want to have other modules also named Integers
in other files?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmh that makes me think it should just be included in a Math module so that we have Math.log. Let's do that later.
src/JSON/Deserializer.dfy
Outdated
js.At(0) == 't' as byte | ||
} | ||
|
||
function Unescape(str: string, start: nat := 0): DeserializationResult<string> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you did not write a verifier for Escape/Unescape, could you please add a comment in front of each function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but just in front of Unescape: Escape
is the spec :)
const NAT8_MAX: nat8 := 0x7F | ||
const NAT16_MAX: nat16 := 0x7FFF | ||
const NAT32_MAX: nat32 := 0x7FFFFFFF | ||
const NAT64_MAX: nat64 := 0x7FFFFFFF_FFFFFFFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be some redundancy with the PR with the Heap. Could you synchronize with @prvshah51 to ensure we define these constant in only one place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Parva didn't include these in the end; right? So no conflict.
Co-authored-by: Mikaël Mayer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few changes to request, but otherwise looks good !
// TODO: Verify this function | ||
function Unescape(str: string, start: nat := 0): DeserializationResult<string> | ||
function {:tailrecursion} {:vcs_split_on_every_assert} Unescape(str: seq<uint16>, start: nat := 0, prefix: seq<uint16> := []): DeserializationResult<seq<uint16>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the TODO above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, not because I don't think it would be valuable to add more specification to this function, but because that will be quite a bit of work. I also don't want to imply there's no verification happening in this code.
src/JSON/Deserializer.dfy
Outdated
} | ||
|
||
function String(js: Grammar.jstring): DeserializationResult<string> { | ||
Transcode8To16Unescaped(js.contents.Bytes()) | ||
// TODO Optimize with a function by method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this TODO be more precise about what can be optimized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it: since that TODO was left there, I make Unescape tail-recursive to address the worst performance issue. There's definitely more to do here and in the Unicode module, but that's a more widespread challenge to take on in the future.
src/JSON/ZeroCopy/Deserializer.dfy
Outdated
assert cs.Bytes() == Spec.Bracketed(sp.t, SuffixedElementSpec) + close.cs.Bytes() by { | ||
assert cs.Bytes() == Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t) + Spec.Structural(close.t, SpecView) + close.cs.Bytes() by { | ||
assert cs.Bytes() == Spec.Structural(open.t, SpecView) + open.cs.Bytes(); | ||
assert open.cs.Bytes() == SuffixedElementsSpec(elems.t) + elems.cs.Bytes(); | ||
assert elems.cs.Bytes() == Spec.Structural(close.t, SpecView) + close.cs.Bytes(); | ||
Seq.Assoc'(Spec.Structural(open.t, SpecView), SuffixedElementsSpec(elems.t), elems.cs.Bytes()); | ||
Seq.Assoc'(Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t), Spec.Structural(close.t, SpecView), close.cs.Bytes()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert cs.Bytes() == Spec.Bracketed(sp.t, SuffixedElementSpec) + close.cs.Bytes() by { | |
assert cs.Bytes() == Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t) + Spec.Structural(close.t, SpecView) + close.cs.Bytes() by { | |
assert cs.Bytes() == Spec.Structural(open.t, SpecView) + open.cs.Bytes(); | |
assert open.cs.Bytes() == SuffixedElementsSpec(elems.t) + elems.cs.Bytes(); | |
assert elems.cs.Bytes() == Spec.Structural(close.t, SpecView) + close.cs.Bytes(); | |
Seq.Assoc'(Spec.Structural(open.t, SpecView), SuffixedElementsSpec(elems.t), elems.cs.Bytes()); | |
Seq.Assoc'(Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t), Spec.Structural(close.t, SpecView), close.cs.Bytes()); | |
} | |
} | |
calc { | |
cs.Bytes(); | |
Spec.Structural(open.t, SpecView) + open.cs.Bytes(); | |
{ assert open.cs.Bytes() == SuffixedElementsSpec(elems.t) + elems.cs.Bytes(); } | |
Spec.Structural(open.t, SpecView) + (SuffixedElementsSpec(elems.t) + elems.cs.Bytes()); | |
{ Seq.Assoc'(Spec.Structural(open.t, SpecView), SuffixedElementsSpec(elems.t), elems.cs.Bytes()); } | |
Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t) + elems.cs.Bytes(); | |
{ assert elems.cs.Bytes() == Spec.Structural(close.t, SpecView) + close.cs.Bytes(); } | |
Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t) + (Spec.Structural(close.t, SpecView) + close.cs.Bytes()); | |
{ Seq.Assoc'(Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t), Spec.Structural(close.t, SpecView), close.cs.Bytes()); } | |
Spec.Structural(open.t, SpecView) + SuffixedElementsSpec(elems.t) + Spec.Structural(close.t, SpecView) + close.cs.Bytes(); | |
Spec.Bracketed(sp.t, SuffixedElementSpec) + close.cs.Bytes(); | |
} |
Calc statements are much nicer to reason about and provide the same flexibility as assert ... by, including the ability to provide local proofs. For example, the calls to the two associativity lemmas can be done as a hint.
I verified visually that this proves the same thing as the nested assert by above, but please give it a try in the verifier.
I tested this proof locally, the Resource count drops from 470k to 426k as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NICE, thanks! I had that thought too but didn't want to get greedy once we got everything to verify. :)
} | ||
assert elems'.StrictlySplitFrom?(cs0, SuffixedElementsSpec); | ||
assert forall e | e in elems'.t :: e.suffix.NonEmpty? by { assert elems'.t == elems.t + [suffixed]; } | ||
assert {:split_here} elems'.cs.Length() < elems.cs.Length(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to apply the same calc technique there but couldn't arrive to something that would verify quickly enough.
Co-authored-by: Mikaël Mayer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me ! Great to have.
Tracking under #119 (and https://github.com/dafny-lang/libraries/milestone/1 in general) rather than delaying merging this any longer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a formatting bug that will fix one issue later
dafny-lang/dafny#3960
but for now it's good enough :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a formatting bug that will fix one issue later
dafny-lang/dafny#3960
but for now it's good enough :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a formatting bug that will fix one issue later
dafny-lang/dafny#3960
but for now it's good enough :-)
The library is documented in
JSON/README.md
, and there is a tutorial inJSON/Tutorial.dfy
.