[WIP] feat(`/sources`): add bulk `DELETE` endpoint #7228

cfm · 2024-09-09T22:09:47Z

Status

Work in progress:

Tests
Expanded test plan?

Description of Changes

In support of freedomofpress/securedrop-client#2160, this pull request proposes a Journalist API endpoint for deleting sources in bulk with $$O(1)$$ instead $$O(n)$$ latency from the perspective of a remote (over Tor) client. This is a strict addition to the v1 Journalist API: no breaking changes; no side effects.

Testing

Needs tests. For now you can test with make dev with (e.g.):

$ curl -X DELETE -d "$(curl http://localhost:8081/api/v1/sources | jq ['.sources[] | .uuid]')" -H "Content-Type: application/json" http://localhost:8081/api/v1/sources
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
				 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:-100 16231  100 16231    0     0   489k      0 --:--:-- --:--:-- --:--:--  495k
{
  "failed": [],
  "succeeded": [
    "cf7bd746-80ce-4e1f-8c94-668fde45347b",
    "57d67752-4a63-4d72-8177-4e883248abe7",
    "bd900dbd-7bfa-4a50-b485-b969a2f0351e"
  ]
}

Deployment

Checklist

If you made changes to the server application code:

Linting (make lint) and tests (make test) pass in the development container

If you made non-trivial code changes:

I have written a test plan and validated it for this PR

Choose one of the following:

I have opened a PR in the docs repo for these changes, or will do so later
I would appreciate help with the documentation
These changes do not require documentation

Strict addition to the v1 Journalist API for deleting sources in bulk with O(1) instead O(n) latency from the perspective of a remote (over Tor) client. No breaking changes; no side effects. Deliberately overdocumented for now. You can test with "make dev" with (e.g.): $ curl -X DELETE -d "$(curl http://localhost:8081/api/v1/sources | jq ['.sources[] | .uuid]')" -H "Content-Type: application/json" http://localhost:8081/api/v1/sources % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-100 16231 100 16231 0 0 489k 0 --:--:-- --:--:-- --:--:-- 495k { "failed": [], "succeeded": [ "cf7bd746-80ce-4e1f-8c94-668fde45347b", "57d67752-4a63-4d72-8177-4e883248abe7", "bd900dbd-7bfa-4a50-b485-b969a2f0351e" ] }

cfm · 2024-09-10T16:10:22Z

We all know Tor's latency penalty firsthand, but when I say $$O(1)$$ instead of $$O(n)$$, I mean:

# Just a GET to the Source Interface, but the difference is negligible versus
# a POST/DELETE to the Journalist Interface:
[workstation user ~]% time ./fetch_source_interface.sh 1                                                                                              
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1111      0  0:00:02  0:00:02 --:--:--  1111
./fetch_source_interface.sh 1  0.01s user 0.02s system 1% cpu 2.823 total
# So, to delete 10 sources in 1 bulk operation, that would be ~3 seconds from
# button-press to feedback in the Client...
[workstation user ~]% time ./fetch_source_interface.sh 10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1063      0  0:00:02  0:00:02 --:--:--  1063
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1227      0  0:00:02  0:00:02 --:--:--  1227
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1099      0  0:00:02  0:00:02 --:--:--  1098
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0    743      0  0:00:04  0:00:04 --:--:--   743
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0    680      0  0:00:04  0:00:04 --:--:--   679
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1234      0  0:00:02  0:00:02 --:--:--  1235
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0    828      0  0:00:03  0:00:03 --:--:--   828
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0    692      0  0:00:04  0:00:04 --:--:--   692
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0    783      0  0:00:03  0:00:03 --:--:--   782
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3098  100  3098    0     0   1227      0  0:00:02  0:00:02 --:--:--  1228
./fetch_source_interface.sh 10  0.17s user 0.25s system 1% cpu 34.609 total
# ...versus ~35 seconds to delete 10 sources in 10 individual jobs.

legoktm · 2024-09-10T16:26:52Z

Overall I think this is a good idea, especially given the known latency wins. IMO the complexity primarily comes in how we do error handling (fail fast? keep trying?).

I would like to see some limit imposed (maybe 50?) just so we're not running close to timeouts in worst case scenarios of large sources and/or legacy GPG keys. I'm not sure if that means the client should also impose a limit or if it needs to implement batching, which is why I'm bringing it up now.

cfm · 2024-09-11T23:17:55Z

I'm moving this out of draft status just so discussion will appear on Slack. It's still WIP.

We discussed this today and concluded:

We're open to making non-breaking ~~changes~~ extensions to the Journalist API like this in support of features on the Client roadmap. We didn't have consensus on whether extensions should increment the API version number. I would argue (a) that they should not and (b) that any proposal that needs to do so is by definition a breaking change and therefore out of current scope.
This doesn't block Upstream "Delete Sources" (batch-delete) implementation securedrop-client#2160, which can proceed with the existing one-shot DELETE /sources/<source_uuid> endpoint until and unless we get this DELETE /sources endpoint.
We'll consider this either for v2.11 or perhaps even a dedicated v2.x release.
@rocodes and I will look at @legoktm's suggestion in [WIP] feat(/sources): add bulk DELETE endpoint #7228 (comment) of a maximum batch size and figure out whether that should be implemented on the Client side at the UI level (restrict the batch to the maximum size) or the job level (divide the batch into sub-batches of at most the maximum size).
- I note that either way we'll need rigorous test cases for failure and retry modes, especially of failed and partially successful (HTTP 207 Multi-Status) batches.

cfm · 2024-10-17T17:57:42Z

@zenmonkeykstop, to clarify from backlog-pruning, I have no further implementation work planned here. There's a naïve test plan here already, but I need to add tests. Let me know if you'd like me to expand the test plan as well. Otherwise the implementation here is ready for review and could be released at any time with no breaking changes on the Server or downstream changes required in the Client.

cfm · 2024-11-21T21:02:58Z

I've marked this for v2.11 for discussion when we plan our next sprint.

legoktm · 2024-11-22T03:50:22Z

securedrop/journalist_app/api.py

+
+            # Deletion is idempotent, so count nonexistent `Source`s as
+            # successes.
+            except NoResultFound:


The try/except should only cover exactly the Source.query.filter().one() line, so any error in delete_collection/append is not caught by this.

Or just use .one_or_none() and avoid catching NoResultFound entirely.

legoktm · 2024-11-22T03:51:29Z

securedrop/journalist_app/api.py

+        transit.
+        """
+        data = request.json
+        if not isinstance(data, list):


Do you also want to check that all the contents of the list are strings?

legoktm · 2024-11-22T03:52:23Z

securedrop/journalist_app/api.py

+                    "succeeded": succeeded,
+                }
+            ),
+            200 if len(failed) == 0 else 207,  # Success or Multi-Status


If they all fail, should that still be 207? I think that's an acceptable compromise to keep the complexity of the client part down.

cfm · 2025-01-22T19:11:57Z

Deferred to v2.13, unless I happen to be able to get this ready for review this sprint.

cfm added needs/discussion queued up for discussion at future team meeting. Use judiciously. needs/test-plan labels Sep 9, 2024

cfm changed the title ~~feat(/sources): add bulk DELETE endpoint~~ [WIP] feat(/sources): add bulk DELETE endpoint Sep 11, 2024

cfm marked this pull request as ready for review September 11, 2024 23:17

cfm requested a review from a team as a code owner September 11, 2024 23:17

cfm self-assigned this Sep 11, 2024

feat(/sources): limit DELETE requests to MAX_BATCH_SIZE

6253662

cfm force-pushed the bulk-delete-sources branch from 23d3e22 to 6253662 Compare September 12, 2024 21:52

zenmonkeykstop self-requested a review October 17, 2024 17:47

cfm removed the needs/discussion queued up for discussion at future team meeting. Use judiciously. label Oct 17, 2024

cfm mentioned this pull request Oct 24, 2024

Support deletion of multiple sources freedomofpress/securedrop-client#2252

Merged

26 tasks

cfm added this to the SecureDrop 2.11.0 milestone Nov 21, 2024

legoktm reviewed Nov 22, 2024

View reviewed changes

cfm modified the milestones: SecureDrop 2.11.0, SecureDrop 2.12.0 Nov 25, 2024

cfm removed their assignment Nov 25, 2024

zenmonkeykstop modified the milestones: SecureDrop 2.12.0, SecureDrop 2.13.0 Jan 22, 2025

cfm self-assigned this Jan 22, 2025

cfm added the needs/tests label Jan 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] feat(`/sources`): add bulk `DELETE` endpoint #7228

[WIP] feat(`/sources`): add bulk `DELETE` endpoint #7228

cfm commented Sep 9, 2024 •

edited

Loading

cfm commented Sep 10, 2024 •

edited

Loading

legoktm commented Sep 10, 2024

cfm commented Sep 11, 2024 •

edited

Loading

cfm commented Oct 17, 2024

cfm commented Nov 21, 2024

legoktm Nov 22, 2024

legoktm Nov 22, 2024

legoktm Nov 22, 2024

cfm commented Jan 22, 2025

[WIP] feat(/sources): add bulk DELETE endpoint #7228

Are you sure you want to change the base?

[WIP] feat(/sources): add bulk DELETE endpoint #7228

Conversation

cfm commented Sep 9, 2024 • edited Loading

Status

Description of Changes

Testing

Deployment

Checklist

If you made changes to the server application code:

If you made non-trivial code changes:

cfm commented Sep 10, 2024 • edited Loading

legoktm commented Sep 10, 2024

cfm commented Sep 11, 2024 • edited Loading

cfm commented Oct 17, 2024

cfm commented Nov 21, 2024

legoktm Nov 22, 2024

Choose a reason for hiding this comment

legoktm Nov 22, 2024

Choose a reason for hiding this comment

legoktm Nov 22, 2024

Choose a reason for hiding this comment

cfm commented Jan 22, 2025

[WIP] feat(`/sources`): add bulk `DELETE` endpoint #7228

[WIP] feat(`/sources`): add bulk `DELETE` endpoint #7228

cfm commented Sep 9, 2024 •

edited

Loading

cfm commented Sep 10, 2024 •

edited

Loading

cfm commented Sep 11, 2024 •

edited

Loading