Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delete_at_pointer #1071

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

RoyBellingan
Copy link

Some provisional code to add support for the delete_at_pointer functionality.

To test with

https://github.com/RoyBellingan/boostDeleteAtPointer

It should produce this

Initial Json:
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "text" : "this is cool",
                "timestamp" : 123456,
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    },
                    {
                        "author" : "Izzy",
                        "timestamp" : 456
                    }
                ]
            }
        ]
    }
]
Removing invalid ptr /1/image 
past-the-end token not supported @ /1
------
Removing invalid ptr /0/image/invalid 
no referenced value @ /0/image/invalid
------
Removing /0/comment/0/text 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    },
                    {
                        "author" : "Izzy",
                        "timestamp" : 456
                    }
                ],
                "timestamp" : 123456
            }
        ]
    }
]

------
Removing /0/comment/0/likes/1 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [
            {
                "likes" : [
                    {
                        "author" : "Coco",
                        "timestamp" : 123
                    }
                ],
                "timestamp" : 123456
            }
        ]
    }
]

------
Removing /0/comment/0 
[
    {
        "image" : {
            "path" : "https://i.natgeofe.com/n/4f5aaece-3300-41a4-b2a8-ed2708a0a27c/domestic-dog_thumb_square.jpg?wp=1&w=170&h=170",
            "size" : 9100
        },
        "comment" : [

        ]
    }
]

------
Removing /0/image 
[
    {
        "comment" : [

        ]
    }
]

------

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

@cppalliance-bot
Copy link

@vinniefalco
Copy link
Member

Why does this PR kill the performance in the benchmarks?
image

@RoyBellingan
Copy link
Author

I will try to do another PR with same code, I am confident @vinniefalco that above was just some temporary fluctuation on the vm running the test.

@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

…on/1791/15/2 ./boost/json/impl/pointer.ipp:457:18: error: 'kind' is not a class, namespace, or enumeration
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

@cppalliance-bot
Copy link

@cppalliance-bot
Copy link

Unreachable code removed.
@cppalliance-bot
Copy link

An automated preview of the documentation is available at https://1071.json.prtest2.cppalliance.org/libs/json/doc/html/index.html

Copy link

codecov bot commented Feb 15, 2025

Codecov Report

Attention: Patch coverage is 0% with 63 lines in your changes missing coverage. Please review.

Project coverage is 93.06%. Comparing base (c02d872) to head (c829369).

Files with missing lines Patch % Lines
include/boost/json/impl/pointer.ipp 0.00% 63 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1071      +/-   ##
===========================================
- Coverage    93.70%   93.06%   -0.65%     
===========================================
  Files           91       91              
  Lines         9139     9202      +63     
===========================================
  Hits          8564     8564              
- Misses         575      638      +63     
Files with missing lines Coverage Δ
include/boost/json/value.hpp 98.83% <ø> (ø)
include/boost/json/impl/pointer.ipp 79.41% <0.00%> (-20.59%) ⬇️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c02d872...c829369. Read the comment docs.

@cppalliance-bot
Copy link

@vinniefalco
Copy link
Member

@sdarwin once again I question these benchmarks....

@sdarwin
Copy link
Collaborator

sdarwin commented Feb 17, 2025

At least at this moment, there is not a basis to think the benchmark is wrong. I will run a test and attempt to replicate that result.

@sdarwin
Copy link
Collaborator

sdarwin commented Feb 17, 2025

Just as expected, the results can be easily replicated. :-) See sdarwin#20 , which shows the image below.
If a benchmark is questionable then re-run the same job by pushing a commit with a small whitespace modification and nothing else.

bench

@grisumbras
Copy link
Member

I'm not entirely sold on this idea. Given that you also want to return the part of the pointer string that was matched (which is not done for other JSON Pointer functions), it appears to me there's a need for lower-level Pointer functionality, with which users could build their own algorithms.

Alternative idea for a simple "delete at pointer" API: add a special "undefined" state to value_ref that erases the corresponding element when used with set_at_pointer.

@grisumbras
Copy link
Member

@sdarwin the question isn't whether benchmark results are random. You showed us several times that they are not. The question is why they are so much affected by random changes. It could be that the benchmark runner program is not very good, but I remember at least in one case a change in a CMake script (that did not relate to the benchmarks) resulted in benchmark performance degradation.

The only way I can explain this phenomenon is that the build process itself exhausts some resource, or results in some file loading/unloading and that changes the overall performance of the system. Maybe we could try building the benchmarks, storing the build artifacts, and then launching the benchmarks with those artifacts in a fresh container?

@sdarwin
Copy link
Collaborator

sdarwin commented Feb 19, 2025

The current benchmark in this pull request is the following:

bench8

That is mostly around 3% or less. Some 0%. It's not wildly unexpected.

However, @grisumbras you should understand as well as anyone, even a very small code change (a wrongly placed "if", a missing "return", a bit or byte in the wrong place) can generate a detour in the code path, such that the code is very buggy, and the results are anyone's guess. It may be accurate and correct for the benchmarks to deviate by 30% or 100%, or 10000%, when there is a bug, a mistake, in the pull request.

@grisumbras
Copy link
Member

I wouldn't expect bugs in this PR's code (whether they exist or not) to affect benchmarks, given that none of this new code is ever executed by the benchmarks.

Regardless, my comment should have been considered more generally. Consider this PR: #1031. It just causes warnings in certain cases. Shouldn't affect runtime at all. And yet several tests are affected quite dramatically.

@RoyBellingan
Copy link
Author

@sdarwin
Copy link
Collaborator

sdarwin commented Feb 19, 2025

Consider this PR: #1031.

Checking notes... that exact week was when the newest server khjson1.cpp.al went into production for the first time.
That's a sea of 0% . :-) It's a great success. My interpretation of that result is:

  • mostly, the code had no affect on benchmarks
  • on specifically "twitter" and "twitterescaped", there was (unknown) benchmark instability and variation. It wasn't caused by the pull request itself. However, at the time, we discussed "twitter" and "twitterescaped" briefly, you had made some remarks, and maybe even changed something in the json code, such that those benches improved later on.
    In recent times, an example of margin-of-error in the benchmarks, and no affect from the PR, is in the recent PR 1073. This is randomness. but pretty good:

bench15

@sdarwin
Copy link
Collaborator

sdarwin commented Feb 19, 2025

@RoyBellingan , there is no evidence of "noisy neighbors" usually: this is not a virtual machine on shared hardware, it is a separate server.

@grisumbras
Copy link
Member

@RoyBellingan after some thinking, I'm not sure that producing an error when an element is not found makes much sense. The effect of the operation is that the element is not nested in the value. If it already wasn't there, the effect is the same. So, the function should mostly report an error if the pointer string is invalid. In this case, if error is not set and true is returned, the element was found and removed, and if error is not set and false is returned, the element wasn't found.

An aside, I think the function should be called erase_at_pointer, to better match the conventional C++ nomenclature. It's also a tiny bit shorter.

In addition, there's a question of what to do with a pointer that designates the root object (an empty string). As far as I can see your algorithm doesn't handle it in a sensible manner. Even if it did, it obviously cannot actually remove the root value, so I'm inclined to think that this situation should actually be an error. Probably a new one.

I also want you to remove the tracking of the consumed part of the Pointer string. None of the other Pointer functions do that right now, and it would be strange to introduce it only for this function. We have a potential idea on how to implement such tracking in the future.

Finally, for the PR to be eventually merged, it should also contain all the necessary overloads (similar to set_at_pointer) and tests. All new lines should be covered by tests, unless there's a reason it can't be reasonably done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants