Adds --verify #195

jspaezp · 2024-02-18T20:48:21Z

Discussed here #153

The idea is to have a CLI flag that works like --dry-run but returns status 1 if any change would have been made to the files.

This PR is still a work in progress. Right now it's compatible with the behavior of the rest of the flags, but I think we need to add a test case that verifies its behavior directly: two files, one that gets changed and one that doesn't, making sure each exits with the correct status. An additional test I can think of is that no file should return status 1 once it has already been stripped:

1. Verify that it returns status 1 and make sure no changes are made.
2. Run nbstripout.
3. Verify that it returns status 0 on the modified file.
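The test plan above could be sketched roughly as follows. This is a self-contained illustration, not nbstripout's code: `strip_outputs`, `verify`, `strip_in_place`, and `make_notebook` are hypothetical stand-ins for the real stripping logic and the proposed `--verify` behavior, and a real test would call the CLI instead.

```python
import copy
import json
import tempfile
from pathlib import Path


def strip_outputs(nb):
    """Hypothetical stand-in for the stripping logic: returns a stripped copy."""
    stripped = copy.deepcopy(nb)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


def verify(paths):
    """Return 1 if stripping would change any file, else 0. Never writes."""
    status = 0
    for path in paths:
        nb = json.loads(Path(path).read_text())
        if strip_outputs(nb) != nb:
            status = 1
    return status


def strip_in_place(path):
    """Rewrite the file with its outputs stripped."""
    nb = json.loads(Path(path).read_text())
    Path(path).write_text(json.dumps(strip_outputs(nb), indent=1))


def make_notebook(outputs, execution_count):
    """Build a minimal one-cell notebook dict for the scenario."""
    cell = {"cell_type": "code", "source": "1 + 1",
            "outputs": outputs, "execution_count": execution_count}
    return {"cells": [cell], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}


def run_scenario():
    with tempfile.TemporaryDirectory() as tmp:
        dirty = Path(tmp) / "dirty.ipynb"
        clean = Path(tmp) / "clean.ipynb"
        output = {"output_type": "stream", "name": "stdout", "text": "2"}
        dirty.write_text(json.dumps(make_notebook([output], 1)))
        clean.write_text(json.dumps(make_notebook([], None)))

        before = dirty.read_text()
        first = verify([dirty, clean])             # step 1: expect status 1
        untouched = dirty.read_text() == before    # ...and no file was modified
        strip_in_place(dirty)                      # step 2: strip for real
        second = verify([dirty, clean])            # step 3: expect status 0
        return first, untouched, second
```

Calling `run_scenario()` walks through the three steps in order and returns the two exit statuses plus a flag saying whether the first pass left the file alone.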

kynan · 2024-03-17T10:08:12Z

nbstripout/_nbstripout.py

    if args.mode == 'zeppelin':
        nb = json.load(input_stream, object_pairs_hook=collections.OrderedDict)
+        nb_str_orig = json.dumps(nb, indent=2)


What's the reason for writing to a JSON formatted string here and below? Could you also just compare the dicts directly? Not necessarily objecting, just trying to understand why you've implemented it this way :)

I was afraid of cases where the dict is different but the serialization is the same; for instance, if for some reason the Zeppelin stripping converts a list to a tuple. Both are equivalent in JSON but not as Python dicts. (It might also prevent some edge cases where copying vs. deep cloning might be an issue.)

Example

    import json

    d1 = {"a": 1, "b": [1,]}
    d2 = {"a": 1, "b": (1,)}
    print(json.dumps(d1) == json.dumps(d2))  # True
    print(d2 == d1)  # False
    print(json.dumps(d2))  # {"a": 1, "b": [1]}

(I am not 100% sure my concerns are well grounded, but this felt safer.)

You are indeed correct that both strip_output and strip_zeppelin_output mutate the existing dict, so we'd probably need to create a deep copy for comparison purposes. I'd prefer that over the JSON serialization.
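A minimal sketch of the deep-copy comparison suggested here, assuming the stripping function mutates its argument in place. `strip_in_place`, `would_change`, and `would_change_shallow` are hypothetical names used only for illustration; they are not nbstripout functions.

```python
import copy


def strip_in_place(nb):
    """Hypothetical stand-in for a stripping function that mutates the dict."""
    for cell in nb.get("cells", []):
        cell.pop("outputs", None)


def would_change(nb):
    """Snapshot the notebook with a deep copy, strip it, then compare."""
    original = copy.deepcopy(nb)
    strip_in_place(nb)
    return nb != original


def would_change_shallow(nb):
    """Pitfall: a shallow copy shares the nested cells, so it never differs."""
    snapshot = dict(nb)
    strip_in_place(nb)
    return nb != snapshot
```

The shallow variant shows why a plain `dict(nb)` or `nb.copy()` is not enough: the nested cell dicts are shared between the snapshot and the mutated notebook, so the comparison reports no change even when outputs were removed.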

kynan · 2024-03-17T10:15:11Z

nbstripout/_nbstripout.py

+            any_local_change = process_notebook(input_stream, output_stream, args, extra_keys)
+            any_change = any_change or any_local_change


Suggested change

    -            any_local_change = process_notebook(input_stream, output_stream, args, extra_keys)
    -            any_change = any_change or any_local_change
    +            if process_notebook(input_stream, output_stream, args, extra_keys):
    +                any_change = True

Wouldn't this be easier?
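One detail worth keeping in mind when simplifying this loop: both the two-line original and the suggested `if` call the processing function for every file. Folding the call into the `or` expression would not, because `or` short-circuits. The sketch below illustrates this with a hypothetical `process` function standing in for `process_notebook`.

```python
def process(name, calls):
    """Hypothetical stand-in for process_notebook: records the call and
    reports whether the file would change."""
    calls.append(name)
    return name.startswith("dirty")


def run_with_if(names):
    """The suggested form: every file is processed."""
    calls, any_change = [], False
    for name in names:
        if process(name, calls):
            any_change = True
    return any_change, calls


def run_with_inline_or(names):
    """Pitfall: once any_change is True, `or` skips the call entirely."""
    calls, any_change = [], False
    for name in names:
        any_change = any_change or process(name, calls)
    return any_change, calls
```

With a changed file listed first, `run_with_inline_or` stops processing the remaining files, while `run_with_if` still visits all of them.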

kynan · 2024-03-17T10:34:52Z

tests/test_end_to_end.py

    with open(NOTEBOOKS_FOLDER / expected_file, mode="r") as f:
        expected = f.read()
+        expected_str = json.dumps(json.loads(expected), indent=2)


Same question: why do you need to convert to JSON?

Inside the test, my main concern was that line endings might differ between the input and the output files (if my memory is not failing me).

To be fair, this might have been an overstep, since it actually changes the test: it checks whether the content is equivalent instead of checking that the file is actually the same. (Let me know if I should revert it here.)

This should indeed not be required. When I investigated this, I was going crazy because there was a persistently failing test, until I realized it was due to some extra leading whitespace in one of the test notebook input files, which I fixed in 5077ab4. Can you please rebase your PR on main and then make those changes?

thanks for the fix, will do.
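To illustrate the trade-off discussed in this thread, here is a small sketch with made-up notebook strings (not the actual test fixtures). Comparing parsed JSON ignores surrounding whitespace, whereas comparing the raw text catches it.

```python
import json

# Made-up example content; the second string has a stray leading space.
raw_expected = '{"cells": [], "nbformat": 4}\n'
raw_actual = ' {"cells": [], "nbformat": 4}\n'


def same_bytes(a, b):
    """Strict check: the file text must match exactly."""
    return a == b


def same_content(a, b):
    """Lenient check: only the parsed JSON structure must match."""
    return json.loads(a) == json.loads(b)
```

The strict check would have flagged the stray whitespace directly, which is an argument for keeping the test as an exact file comparison.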

Co-authored-by: Florian Rathgeber <[email protected]>

first feature commit

21d7638

kynan reviewed Mar 17, 2024

jspaezp and others added 3 commits March 18, 2024 02:28

Update nbstripout/_nbstripout.py

e6ca675

Co-authored-by: Florian Rathgeber <[email protected]>

Update nbstripout/_nbstripout.py

5eb68f0

Co-authored-by: Florian Rathgeber <[email protected]>

Update nbstripout/_nbstripout.py

9fb0d0f

Co-authored-by: Florian Rathgeber <[email protected]>

kynan requested changes Mar 24, 2024

FabienDanieau mentioned this pull request Aug 14, 2024

check notebook before commit pollen-robotics/reachy2-sdk#343

Open
