Add cleaned option #3234

ksotik · 2023-07-26T20:41:55Z

This option allows disabling server-side value validation for all types.
It also fixes the "safe" option docs and the inverted logic of its default value.

tillprochaska · 2023-07-27T13:43:02Z

aleph/views/collections_api.py

+    # This will disable (if True) values validation for all types of all entities / properties
+    # (will pass cleaned=True to the model.get_proxy() in the aleph/logic/processing.py)
+    cleaned = get_flag("cleaned", default=False)
+    # Flag is only available for admins:
+    if not request.authz.is_admin:
+        cleaned = False
+


Hey @ksotik, thanks for opening your first PR in the Aleph repo! :) 🎉

Could you add a test for the new flag?
https://github.com/alephdata/aleph/blob/develop/aleph/tests/test_collections_api.py
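
For illustration, the admin-only behaviour in the diff above could be checked roughly like this. It is only a sketch: `resolve_cleaned` is a hypothetical stand-in for the inline logic in the view, not an existing Aleph function, and a real test would go through the test client used in `test_collections_api.py`.

```python
def resolve_cleaned(requested: bool, is_admin: bool) -> bool:
    """Hypothetical helper mirroring the diff: only admins may set `cleaned`.

    `requested` is the value parsed from the query string (default False).
    """
    if not is_admin:
        # Non-admins can never skip validation, whatever they request.
        return False
    return requested


def test_cleaned_flag_is_admin_only():
    # Admins get exactly what they asked for.
    assert resolve_cleaned(True, is_admin=True) is True
    assert resolve_cleaned(False, is_admin=True) is False
    # For everyone else the flag is forced back to False.
    assert resolve_cleaned(True, is_admin=False) is False
    assert resolve_cleaned(False, is_admin=False) is False
```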

tillprochaska · 2023-07-27T14:01:25Z

aleph/views/collections_api.py

    # loading of document data:
    safe = get_flag("safe", default=True)
    # Flag is only available for admins:
    if not request.authz.is_admin:
        safe = True

+    # This will disable (if True) values validation for all types of all entities / properties
+    # (will pass cleaned=True to the model.get_proxy() in the aleph/logic/processing.py)
+    cleaned = get_flag("cleaned", default=False)


Slightly concerned about the fact that the two flags cleaned and safe do related things but behave in opposite ways when set to true:

cleaned=true: Assume that the data can be trusted and contains only valid values.

safe=true: Assume that the data can NOT be trusted, so perform additional security measures.

Not sure though if there is a better alternative, and it's probably not a huge problem as this is mostly relevant in edge cases. A validate flag is used in some other endpoint. It has a slightly different behavior though (validate=true means that Aleph will return an error instead of silently discarding invalid values).

Oh, maybe the flag could be clean instead of cleaned (and default to true)?

clean=true means that the data cannot be trusted and that the data should be cleaned from invalid values.

safe=true means that the data cannot be trusted and that file checksums should be removed.
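
To make the difference between the flags concrete, here is a toy model of the semantics described above. It is not Aleph's actual code: `ingest_values`, its parameters, and the use of `str.isdigit` as a validator are all assumptions made for the example, and combining `clean` and `validate` in one function is purely illustrative since the two flags live on different endpoints.

```python
from typing import Callable, Iterable, List


def ingest_values(
    values: Iterable[str],
    is_valid: Callable[[str], bool],
    clean: bool = True,
    validate: bool = False,
) -> List[str]:
    """Toy model of the flag semantics (hypothetical, not Aleph's API).

    clean=True     -> data is untrusted, invalid values are dropped silently
    clean=False    -> data is trusted, values pass through unchecked
    validate=True  -> an invalid value raises an error instead of being dropped
    """
    if not clean:
        # Trusted data: skip validation entirely.
        return list(values)

    kept = []
    for value in values:
        if is_valid(value):
            kept.append(value)
        elif validate:
            raise ValueError(f"invalid value: {value!r}")
        # Otherwise the invalid value is discarded silently.
    return kept


raw = ["1", "x", "2"]
print(ingest_values(raw, str.isdigit))               # invalid "x" is dropped
print(ingest_values(raw, str.isdigit, clean=False))  # everything passes through
```

With this naming, `clean=true` and `safe=true` both mean "do not trust the input", so the defaults point in the same direction.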

Rosencrantz · 2023-09-28T09:10:03Z

@tillprochaska Can you take another look at this and check whether it is now up to your standards?

tillprochaska

Really sorry this took so long, but thanks a lot for opening the PR and making these changes. I’ll also merge the respective alephclient PR.

* Add cleaned option
* Change cleaned=True to clean=False; add clean flag tests

Add cleaned option

f6ef048

tillprochaska reviewed Jul 27, 2023


Change cleaned=True to clean=False; add clean flag tests

49a0dcb

tillprochaska approved these changes Dec 6, 2023


tillprochaska merged commit 0044f5b into alephdata:develop Dec 6, 2023

simonwoerpel pushed a commit to investigativedata/aleph that referenced this pull request Apr 22, 2024

Add cleaned option (alephdata#3234)

b6f9d3a

