JSONPath backend comparison #236

Open
gavv opened this issue Jan 24, 2023 · 8 comments
Labels
help wanted Contributions are welcome important Important task

Comments

@gavv
Owner

gavv commented Jan 24, 2023

This is a somewhat unusual issue: it's not about coding, but about doing research.

We're currently using yalp/jsonpath as the JSONPath backend. It works well; however, the author states in the README that it's experimental and that users should switch to other engines.

Here are some other engines that I've found:

ojg looks promising, but to proceed, we need to compare it with yalp/jsonpath.

Ideally, we need two things:

  • a table of JSONPath syntax features that summarizes what is and isn't supported by yalp/jsonpath and ojg
  • two test sets: one is a list of queries that work exactly the same in both parsers; the other is a list of queries that work differently (e.g. one of the parsers fails, or they return different results)

Once we have this, we can decide whether it's possible to switch without breaking compatibility, and if not, we'll be able to provide a migration guide for users.

We already have a test that covers many queries, though likely not all of them: https://github.com/gavv/httpexpect/blob/master/value_test.go#L441

Previous discussion: #49

Useful materials:

@gavv gavv added help wanted Contributions are welcome important Important task labels Jan 24, 2023
@gavv gavv changed the title JSONPath backend research JSONPath backend comparison Jan 27, 2023
@gavv gavv pinned this issue Feb 1, 2023
@stubents

stubents commented Oct 2, 2023

Hi,

I'd really love to see ojg JSON paths in httpexpect, because the support for subexpressions and filters is quite handy.
So I ran the test suite of yalp/jsonpath against ojg; here's my quick take :)

Returned data types:

  • yalp only returns slices when the path expression includes recursive searches or wildcards. With ojg you have to choose between Get(...) (always a slice) or First(...)
  • however, if nothing is found ojg returns nil instead of an empty slice
  • If you reference an element that does not exist, yalp returns an error and ojg an empty slice

Supported syntax (as mentioned ojg supports a lot more here, here's just what's missing):

  • ojg doesn't support bracket syntax without quotes ($[A], but that's not the consensus anyway)

Behaviour:

  • ojg interprets negative steps with open intervals differently from yalp and most other libraries (e.g. $.A[::-1] always produces an empty result in ojg)
  • yalp does not go with the consensus about the order of recursive descents (so the results are most of the time roughly backwards)
  • ojg can make sense of those queries while yalp (and me) can't: $..1, $.1, $.., .A

Hope this helps

@stubents

stubents commented Oct 4, 2023

By the way... here's the hacky code I used: https://github.com/stubents/yalp-vs-ojg/blob/main/yalp_vs_ojg_test.go
And that's the table-like output it produces:

Running tests on:
{
  "expensive": 10,
  "store": {
    "bicycle": {
      "color": "red",
      "price": 19.95
    },
    "book": [
      {
        "author": "Nigel Rees",
        "category": "reference",
        "price": 8.95,
        "title": "Sayings of the Century"
      },
      {
        "author": "Evelyn Waugh",
        "category": "fiction",
        "price": 12.99,
        "title": "Sword of Honour"
      },
      {
        "author": "Herman Melville",
        "category": "fiction",
        "isbn": "0-553-21311-3",
        "price": 8.99,
        "title": "Moby Dick"
      },
      {
        "author": "J. R. R. Tolkien",
        "category": "fiction",
        "isbn": "0-395-19395-8",
        "price": 22.99,
        "title": "The Lord of the Rings"
      }
    ]
  }
}
Works the same way:   $.store.book[*].author 	ojg: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien] in  yalp: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien]
Works the same way:   $..author      	        ojg: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien] in  yalp: [Nigel Rees Evelyn Waugh Herman Melville J. R. R. Tolkien]


Running tests on:
{
  "A": [
    "string",
    23.3,
    3,
    true,
    false,
    null
  ],
  "B": "value",
  "C": 3.14,
  "D": {
    "C": 3.1415,
    "V": [
      "string2a",
      "string2b",
      {
        "C": 3.141592
      }
    ]
  },
  "E": {
    "A": [
      "string3"
    ],
    "D": {
      "V": {
        "C": 3.14159265
      }
    }
  },
  "F": {
    "V": [
      "string4a",
      "string4b",
      {
        "CC": 3.1415926535
      },
      {
        "CC": "hello"
      },
      [
        "string5a",
        "string5b"
      ],
      [
        "string6a",
        "string6b"
      ]
    ]
  }
}
Works the same way:   $.A            	ojg: [string 23.3 3 true false <nil>]                   in  yalp: [string 23.3 3 true false <nil>]
Works the same way:   $.A[*]         	ojg: [string 23.3 3 true false <nil>]                   in  yalp: [string 23.3 3 true false <nil>]
Works the same way:   $.A.*          	ojg: [string 23.3 3 true false <nil>]                   in  yalp: [string 23.3 3 true false <nil>]
Works the same way:   $.A.*.a        	ojg: []                                                 in  yalp: []
Works the same way:   $              	ojg: map[A:[string 23.3 3 true false <nil>] B:value C:3.14 D:map[C:3.1415 V:[string2a string2b map[C:3.141592]]] E:map[A:[string3] D:map[V:map[C:3.14159265]]] F:map[V:[string4a string4b map[CC:3.1415926535] map[CC:hello] [string5a string5b] [string6a string6b]]]] in  yalp: map[A:[string 23.3 3 true false <nil>] B:value C:3.14 D:map[C:3.1415 V:[string2a string2b map[C:3.141592]]] E:map[A:[string3] D:map[V:map[C:3.14159265]]] F:map[V:[string4a string4b map[CC:3.1415926535] map[CC:hello] [string5a string5b] [string6a string6b]]]]
Works the same way:   $.A[0]         	ojg: string                                             in  yalp: string
Works the same way:   $["A"][0]      	ojg: string                                             in  yalp: string
Works the same way:   $.A[1:4]       	ojg: [23.3 3 true]                                      in  yalp: [23.3 3 true]
Works the same way:   $.A[:-1]       	ojg: [string 23.3 3 true false]                         in  yalp: [string 23.3 3 true false]
Works the same way:   $.F.V[4:5][0,1] 	ojg: [string5a string5b]                                in  yalp: [string5a string5b]
Works the same way:   $.A[-2:]       	ojg: [false <nil>]                                      in  yalp: [false <nil>]
Works the same way:   $.F.V[4:6]     	ojg: [[string5a string5b] [string6a string6b]]          in  yalp: [[string5a string5b] [string6a string6b]]
Works the same way:   $.A[1,4,2]     	ojg: [23.3 false 3]                                     in  yalp: [23.3 false 3]
Works the same way:   $["B","C"]     	ojg: [value 3.14]                                       in  yalp: [value 3.14]
Works the same way:   $["C","B"]     	ojg: [3.14 value]                                       in  yalp: [3.14 value]
Works the same way:   $.F.V[4,5][0:2] 	ojg: [string5a string5b string6a string6b]              in  yalp: [string5a string5b string6a string6b]
Works the same way:   $.A[::2]       	ojg: [string 3 false]                                   in  yalp: [string 3 false]
Works differntly:     $.A[::-1]      	ojg: []                                                 in  yalp: [<nil> false true 3 23.3 string]
Works the same way:   $.F.V[4:6][1]  	ojg: [string5b string6b]                                in  yalp: [string5b string6b]
Works the same way:   $.F.V[4:6][0,1] 	ojg: [string5a string5b string6a string6b]              in  yalp: [string5a string5b string6a string6b]
Not supported in ojg: $[A][0]        	(parse error at 3 in $[A][0])
Works the same way:   $["A"][0]      	ojg: string                                             in  yalp: string
Not supported in ojg: $[B,C]         	(parse error at 3 in $[B,C])
Works the same way:   $["B","C"]     	ojg: [value 3.14]                                       in  yalp: [value 3.14]
Works the same way:   $..V[2,3].CC   	ojg: [3.1415926535 hello]                               in  yalp: [3.1415926535 hello]
Works differntly:     $..["C"]       	ojg: [3.14159265 3.141592 3.1415 3.14]                  in  yalp: [3.14 3.1415 3.141592 3.14159265]
Works the same way:   $.A..*         	ojg: [string 23.3 3 true false <nil>]                   in  yalp: [string 23.3 3 true false <nil>]
Works the same way:   $.A.*          	ojg: [string 23.3 3 true false <nil>]                   in  yalp: [string 23.3 3 true false <nil>]
Works differntly:     $..A[0]        	ojg: [string3 string]                                   in  yalp: [string string3]
Works differntly:     $.*.V[0,1]     	ojg: [string4a string4b string2a string2b]              in  yalp: [string2a string2b string4a string4b]
Works the same way:   $.*.V[2].C     	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.*.V[2:3].*   	ojg: [3.141592 3.1415926535]                            in  yalp: [3.141592 3.1415926535]
Works differntly:     $..A..*        	ojg: [string3 string 23.3 3 true false <nil>]           in  yalp: [string 23.3 3 true false <nil> string3]
Works differntly:     $..V[*].*      	ojg: [3.1415926535 hello string5a string5b string6a string6b 3.141592] in  yalp: [3.141592 3.1415926535 hello string5a string5b string6a string6b]
Works the same way:   $.D.*..C       	ojg: [3.141592]                                         in  yalp: [3.141592]
Works differntly:     $.D.V..*       	ojg: [3.141592 string2a string2b map[C:3.141592]]       in  yalp: [string2a string2b map[C:3.141592] 3.141592]
Works the same way:   $.*.V[2:4].*   	ojg: [3.141592 3.1415926535 hello]                      in  yalp: [3.141592 3.1415926535 hello]
Works the same way:   $..V[2:4].CC   	ojg: [3.1415926535 hello]                               in  yalp: [3.1415926535 hello]
Works differntly:     $..[0]         	ojg: [string5a string6a string4a string3 string2a string] in  yalp: [string string2a string3 string4a string5a string6a]
Works the same way:   $.D.V.*.C      	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.*.V..C       	ojg: [3.141592]                                         in  yalp: [3.141592]
Works differntly:     $..D..V..C     	ojg: [3.14159265 3.141592]                              in  yalp: [3.141592 3.14159265]
Works differntly:     $..A           	ojg: [[string3] [string 23.3 3 true false <nil>]]       in  yalp: [[string 23.3 3 true false <nil>] [string3]]
Works the same way:   $..V[*].C      	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.D.V..C       	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.*.D.V.C      	ojg: [3.14159265]                                       in  yalp: [3.14159265]
Works the same way:   $.*.D.V..*     	ojg: [3.14159265]                                       in  yalp: [3.14159265]
Works differntly:     $.*.*.*.C      	ojg: [3.14159265 3.141592]                              in  yalp: [3.141592 3.14159265]
Works differntly:     $.*.V[0:2]     	ojg: [string4a string4b string2a string2b]              in  yalp: [string2a string2b string4a string4b]
Works the same way:   $..V[2].C      	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.*.V[2].*     	ojg: [3.141592 3.1415926535]                            in  yalp: [3.141592 3.1415926535]
Works differntly:     $..C           	ojg: [3.14159265 3.141592 3.1415 3.14]                  in  yalp: [3.14 3.1415 3.141592 3.14159265]
Works differntly:     $..V..C        	ojg: [3.14159265 3.141592]                              in  yalp: [3.141592 3.14159265]
Works differntly:     $..A[0,1]      	ojg: [string3 string 23.3]                              in  yalp: [string 23.3]
Works the same way:   $.*.V[0]       	ojg: [string2a string4a]                                in  yalp: [string2a string4a]
Works differntly:     $.*.V[1]       	ojg: [string4b string2b]                                in  yalp: [string2b string4b]
Works the same way:   $..ZZ          	ojg: []                                                 in  yalp: []
Works the same way:   $.D.V..*.C     	ojg: [3.141592]                                         in  yalp: [3.141592]
Works the same way:   $.*.D..C       	ojg: [3.14159265]                                       in  yalp: [3.14159265]

Both produce error:   $.A*]
Both produce error:   $[C:B]
Both produce error:   $.A[1:4:0:0]
No Error with ojg:    $.1
Both produce error:   $[A][0
Both produce error:   $["]
No Error with ojg:    $..
Both produce error:   $[B,C
Both produce error:   $.A[1,4.2]
Both produce error:   $.
Both produce error:   $.A[]
Both produce error:   $.*V
Both produce error:   $.A[:,]
No Error with ojg:    $..1
No Error with ojg:    .A
No Error with ojg:    $.ZZZ

@gavv
Owner Author

gavv commented Oct 7, 2023

Thanks a lot for looking into this. This is very helpful.

I think ojg would be a very good addition. The differences seem big enough to add it as a separate method instead of replacing the existing one. Currently we have Path(). We can mark it deprecated, add a new method, say, Query(), and provide a migration guide. Since migration can be quite painful, I think that despite the deprecation, Path() will never actually be deleted.

In addition, I think we should make migration as smooth as we can.

yalp only returns slices when the path expression includes recursive searches or wildcards. With ojg you have to choose between Get(...) (always a slice) or First(...)

Ideally, we need to handle it somehow. I think if Query() unconditionally returns an Array/slice value, it will be both inconvenient and complicate migration.

Do you think we can reliably implement a trick like the one you're using here, but by inspecting the parsed jp.Expr? It seems that Expr is a slice of Frag interfaces, and the implementations of Frag are exported types, so we can probably iterate over the fragments and check their underlying types?
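
The idea of walking the fragments and switching on their types can be sketched as follows. This is only an illustration under stated assumptions: the fragment types below are local stand-ins, not ojg's actual types, and the set of fragment kinds treated as multi-value (wildcard, recursive descent, union, slice) is an assumption inferred from the comparison output in this thread.

```go
package main

import "fmt"

// Stand-in fragment types for illustration only. They are NOT ojg's types;
// a real implementation would switch on the exported Frag implementations
// of the jp package instead.
type frag interface{}

type child string      // .name or ["name"]
type nth int           // [3]
type wildcard struct{} // * or [*]
type descent struct{}  // ..
type union []any       // [0,1] or ["B","C"]
type slice []int       // [1:4]

// isMultiValue reports whether an expression can yield more than one match,
// judged purely from the kinds of fragments it contains.
func isMultiValue(expr []frag) bool {
	for _, f := range expr {
		switch f.(type) {
		case wildcard, descent, union, slice:
			return true
		}
	}
	return false
}

func main() {
	// Models $.A[0]: only child and index fragments, so a single value.
	fmt.Println(isMultiValue([]frag{child("A"), nth(0)})) // false

	// Models $..V[2,3].CC: contains a descent and a union.
	fmt.Println(isMultiValue([]frag{descent{}, child("V"), union{2, 3}, child("CC")})) // true

	// In this model a quoted "*" key is an ordinary child, not a wildcard,
	// which is the kind of case a string-based check gets wrong.
	fmt.Println(isMultiValue([]frag{child("A"), child("*")})) // false
}
```

Working on parsed fragments rather than on the query string means the decision does not depend on how a key happens to be spelled.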

however, if nothing is found ojg returns nil instead of an empty slice

I guess we can detect nil and make the behavior similar to current behavior of Path().

If you reference an element that does not exist, yalp returns an error and ojg an empty slice

Not a big deal: if httpexpect gets an error from yalp, it fails the test, which means users likely don't have such queries in their tests, so migration won't break anything here.

Also, the new behavior makes more sense, because it becomes possible to assert the absence of a certain element.

ojg doesn't support bracket syntax without quotes ($[A], but that's not the consensus anyway)

ojg interprets negative steps with open intervals differently from yalp and most other libraries (e.g. $.A[::-1] always produces an empty result in ojg)

So these two things are the major breaking points, especially the first one, and it seems we can't do anything about them except document them in the migration guide.

yalp does not go with the consensus about the order of recursive descents (so the results are most of the time roughly backwards)

This is another breaking point, but a less important one, because hopefully not many tests are tied to a specific order, especially given that the existing order is strange.
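
Tests that should survive the ordering change can compare results as unordered collections. A rough sketch in plain Go follows; `sameElements` is a hypothetical helper, not part of httpexpect or either backend, and it compares values by their printed form, which is adequate for scalar JSON values but only approximates deep equality:

```go
package main

import (
	"fmt"
	"sort"
)

// sameElements reports whether a and b hold the same values regardless of
// order. Values are compared by their %#v rendering.
func sameElements(a, b []any) bool {
	if len(a) != len(b) {
		return false
	}
	// keys renders every value to a string and sorts the renderings.
	keys := func(in []any) []string {
		out := make([]string, len(in))
		for i, v := range in {
			out[i] = fmt.Sprintf("%#v", v)
		}
		sort.Strings(out)
		return out
	}
	ka, kb := keys(a), keys(b)
	for i := range ka {
		if ka[i] != kb[i] {
			return false
		}
	}
	return true
}

func main() {
	// Results reported above for $..C by the two backends.
	ojg := []any{3.14159265, 3.141592, 3.1415, 3.14}
	yalp := []any{3.14, 3.1415, 3.141592, 3.14159265}
	fmt.Println(sameElements(ojg, yalp)) // true
}
```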

ojg can make sense of those queries while yalp (and me) can't: $..1, $.1, $.., .A

Hopefully other users can't either :)

@gavv
Owner Author

gavv commented Oct 7, 2023

So here are the steps I see for this task:

  • see if we can automatically detect single-value vs multi-value query
  • see if we can automatically detect the case when nothing is found
  • implement a new method Query() that performs the two detections above so that it behaves more like Path()
  • upgrade test suite based on code you provided:
    • if needed, add more tests to cover all important queries (e.g. I guess error cases are missing?)
    • extend tests so that each query tests both Path() and Query(), and provide alternative results for them when needed
  • write a migration guide (can be a section in the README or a comment on Query): list what is now unsupported, what changed, and which new features became available; and add a link to our test suite
  • deprecate Path() and recommend using Query() instead (I think we'll start with a "soft" deprecation, i.e. we'll say it in the comment, but won't add the special "Deprecated:" comment that triggers warnings; we'll probably add it in the next major release)
  • update examples and README

@gavv
Owner Author

gavv commented Oct 7, 2023

@stubents Thanks for your help and let me know if you wish to work on any of these.

@stubents

Ideally, we need to handle it somehow. I think if Query() unconditionally returns an Array/slice value, it will be both inconvenient and complicate migration.

Do you think we can reliably implement a trick like the one you're using here, but by inspecting the parsed jp.Expr? It seems that Expr is a slice of Frag interfaces, and the implementations of Frag are exported types, so we can probably iterate over the fragments and check their underlying types?

That hack from the test won't work for many edge cases like $.A["*"]; inspecting the parsed Expr sounds more promising. However, what do you think about creating two methods like Query() and QueryAll()?
This would give the user more control and might protect httpexpect from having to handle unexpected edge cases.

I guess we can detect nil and make the behavior similar to current behavior of Path().

Maybe it's okay to keep the nil. If you replace it with an empty slice, you can no longer distinguish between finding an empty array and not finding anything. I think this might be a downside of the current yalp implementation (e.g. $..ZZ: I think ojg would actually return nil there. It's just listed as [] above because of one of those hacks that make the results comparable. If there were something in the JSON like {"ZZ": []}, ojg would return something different, but yalp would still return the same).
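
The distinction can be illustrated in plain Go, where a nil slice and a slice containing one empty array are different values. The result shapes below are assumptions based on the description above, not verified output of either library, and `describe` is a hypothetical helper:

```go
package main

import "fmt"

// describe distinguishes "no match" (nil slice) from "matched something",
// including the case where the single match is itself an empty array.
func describe(results []any) string {
	switch {
	case results == nil:
		return "nothing found"
	case len(results) == 0:
		return "empty result list"
	default:
		return fmt.Sprintf("%d match(es): %v", len(results), results)
	}
}

func main() {
	// Assumed shape for $..ZZ on a document without any "ZZ" key.
	var none []any
	// Assumed shape for $..ZZ on {"ZZ": []}: one match, an empty array.
	found := []any{[]any{}}

	fmt.Println(describe(none))  // nothing found
	fmt.Println(describe(found)) // 1 match(es): [[]]
}
```

Replacing nil with an empty slice before returning would collapse the first two cases of the switch into one.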

@stubents Thanks for your help and let me know if you wish to work on any of these.

Sure, I can give it a try, I just can't promise when I'll find time for it. Anyway, since this issue isn't very recent, I guess it won't be time critical :)

@stubents

Ah... no, yalp would return [[]], I guess :)
Not sure what I like better

@stubents

stubents commented May 21, 2024

Hi @gavv
I gave it a try: #446
I changed my mind about two methods and tried to implement it just as you described in #236 (comment)
If you like the PR so far, I'd add the migration guide and the rest
