-
-
Notifications
You must be signed in to change notification settings - Fork 185
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix!: reject duplicate submissions (#5047)
## Summary Implemented logic to detect and reject duplicate submissions. ## Description We have identified a race condition in the submission processing that causes duplicate submissions with identical UUIDs and XML hashes. This issue is particularly problematic under conditions with multiple remote devices submitting forms simultaneously over unreliable networks. To address this issue, a PR has been raised with the following proposed changes: - Race Condition Resolution: A locking mechanism has been added to prevent the race condition when checking for existing instances and creating new ones. This aims to eliminate duplicate submissions. - UUID Enforcement: Submissions without a UUID are now explicitly disallowed. This ensures that every submission is uniquely identifiable and further mitigates the risk of duplicate entries. - Introduction of `root_uuid`: - To ensure a consistent submission UUID throughout its lifecycle and prevent duplicate submissions with the same UUID, a new `root_uuid` column has been added to the `Instance` model with a unique constraint (`root_uuid` per `xform`). - If the `<meta><rootUuid>` is present in the submission XML, it is stored in the `root_uuid` column. - If `<meta><rootUuid>` is not present, the value from `<meta><instanceID>` is used instead. - This approach guarantees that the `root_uuid` remains constant across the lifecycle of a submission, providing a reliable identifier for all instances. - UUID Handling Improvement: Updated the logic to strip only the `uuid:` prefix while preserving custom, non-UUID ID schemes (e.g., domain.com:1234). This ensures compliance with the OpenRosa spec and prevents potential ID collisions with custom prefixes. - Error Handling: - 202 Accepted: Returns when content is identical to an existing submission and successfully processed. - 409 Conflict: Returns when a duplicate UUID is detected but with differing content. These changes should improve the robustness of the submission process and prevent both race conditions and invalid submissions. ## Notes - Implemented a fix to address the race condition that leads to duplicate submissions with the same UUID and XML hash. - Incorporated improvements from existing work, ensuring consistency and robustness in handling concurrent submissions. - The fix aims to prevent duplicate submissions, even under high load and unreliable network conditions. ## Related issues Supersedes [kobotoolbox/kobocat#876](kobotoolbox/kobocat#876) and kobotoolbox/kobocat#859 --------- Co-authored-by: Olivier Leger <[email protected]>
- Loading branch information
1 parent
d22b8b5
commit 9598180
Showing
37 changed files
with
915 additions
and
264 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,18 +1,33 @@ | ||
import multiprocessing | ||
import os | ||
from collections import defaultdict | ||
from functools import partial | ||
|
||
import pytest | ||
import requests | ||
import simplejson as json | ||
from django.conf import settings | ||
from django.contrib.auth.models import AnonymousUser | ||
from django.core.files.base import ContentFile | ||
from django.core.files.uploadedfile import InMemoryUploadedFile | ||
from django.test.testcases import LiveServerTestCase | ||
from django_digest.test import DigestAuth | ||
from rest_framework.authtoken.models import Token | ||
|
||
from kobo.apps.kobo_auth.shortcuts import User | ||
from kobo.apps.openrosa.apps.main.models import UserProfile | ||
from kobo.apps.openrosa.libs.tests.mixins.request_mixin import RequestMixin | ||
from rest_framework import status | ||
|
||
from kobo.apps.openrosa.apps.api.tests.viewsets.test_abstract_viewset import ( | ||
TestAbstractViewSet, | ||
) | ||
from kobo.apps.openrosa.apps.api.viewsets.xform_submission_api import XFormSubmissionApi | ||
from kobo.apps.openrosa.apps.logger.models import Attachment | ||
from kobo.apps.openrosa.apps.main import tests as main_tests | ||
from kobo.apps.openrosa.libs.constants import CAN_ADD_SUBMISSIONS | ||
from kobo.apps.openrosa.libs.utils.guardian import assign_perm | ||
from kobo.apps.openrosa.libs.utils import logger_tools | ||
from kobo.apps.openrosa.libs.utils.logger_tools import ( | ||
OpenRosaResponseNotAllowed, | ||
OpenRosaTemporarilyUnavailable, | ||
|
@@ -212,7 +227,7 @@ def test_post_submission_authenticated_bad_json(self): | |
self.assertTrue('error' in rendered_response.data) | ||
self.assertTrue( | ||
rendered_response.data['error'].startswith( | ||
'Received empty submission' | ||
'Instance ID is required' | ||
) | ||
) | ||
self.assertTrue(rendered_response.status_code == 400) | ||
|
@@ -418,7 +433,7 @@ def test_submission_account_inactive(self): | |
|
||
def test_submission_blocking_flag(self): | ||
# Set 'submissions_suspended' True in the profile to test if | ||
# submission do fail with the flag set | ||
# submission does fail with the flag set | ||
self.xform.user.profile.submissions_suspended = True | ||
self.xform.user.profile.save() | ||
|
||
|
@@ -536,3 +551,108 @@ def test_submission_customizable_confirmation_message(self): | |
self.assertContains( | ||
response, 'Successful submission.', status_code=201 | ||
) | ||
|
||
|
||
class ConcurrentSubmissionTestCase(RequestMixin, LiveServerTestCase): | ||
""" | ||
Inherit from LiveServerTestCase to be able to test concurrent requests | ||
to submission endpoint in different transactions (and different processes). | ||
Otherwise, DB is populated only on the first request but still empty on | ||
subsequent ones. | ||
""" | ||
|
||
def setUp(self): | ||
self.user = User.objects.create_user( | ||
username='bob', password='bob', email='[email protected]' | ||
) | ||
self.token, _ = Token.objects.get_or_create(user=self.user) | ||
UserProfile.objects.get_or_create(user=self.user) | ||
|
||
def publish_xls_form(self): | ||
path = os.path.join( | ||
settings.OPENROSA_APP_DIR, | ||
'apps', | ||
'main', | ||
'tests', | ||
'fixtures', | ||
'transportation', | ||
'transportation.xls', | ||
) | ||
|
||
with open(path, 'rb') as f: | ||
xls_file = ContentFile(f.read(), name='transportation.xls') | ||
|
||
self.xform = logger_tools.publish_xls_form(xls_file, self.user) | ||
|
||
@pytest.mark.skipif( | ||
settings.SKIP_TESTS_WITH_CONCURRENCY, | ||
reason='GitLab does not seem to support multi-processes' | ||
) | ||
def test_post_concurrent_same_submissions(self): | ||
DUPLICATE_SUBMISSIONS_COUNT = 2 # noqa | ||
|
||
self.publish_xls_form() | ||
username = 'bob' | ||
survey = 'transport_2011-07-25_19-05-49' | ||
results = defaultdict(int) | ||
|
||
with multiprocessing.Pool() as pool: | ||
for result in pool.map( | ||
partial( | ||
submit_data, | ||
live_server_url=self.live_server_url, | ||
survey_=survey, | ||
username_=username, | ||
token_=self.token.key | ||
), | ||
range(DUPLICATE_SUBMISSIONS_COUNT), | ||
): | ||
results[result] += 1 | ||
|
||
assert results[status.HTTP_201_CREATED] == 1 | ||
assert results[status.HTTP_202_ACCEPTED] == DUPLICATE_SUBMISSIONS_COUNT - 1 | ||
|
||
|
||
def submit_data(identifier, survey_, username_, live_server_url, token_): | ||
""" | ||
Submit data to live server. | ||
It has to be outside `ConcurrentSubmissionTestCase` class to be pickled by | ||
`multiprocessing.Pool().map()`. | ||
""" | ||
media_file = '1335783522563.jpg' | ||
main_directory = os.path.dirname(main_tests.__file__) | ||
path = os.path.join( | ||
main_directory, | ||
'fixtures', | ||
'transportation', | ||
'instances', | ||
survey_, | ||
media_file, | ||
) | ||
with open(path, 'rb') as f: | ||
f = InMemoryUploadedFile( | ||
f, | ||
'media_file', | ||
media_file, | ||
'image/jpg', | ||
os.path.getsize(path), | ||
None, | ||
) | ||
submission_path = os.path.join( | ||
main_directory, | ||
'fixtures', | ||
'transportation', | ||
'instances', | ||
survey_, | ||
f'{survey_}.xml', | ||
) | ||
with open(submission_path) as sf: | ||
files = {'xml_submission_file': sf, 'media_file': f} | ||
headers = {'Authorization': f'Token {token_}'} | ||
response = requests.post( | ||
f'{live_server_url}/{username_}/submission', | ||
files=files, | ||
headers=headers | ||
) | ||
return response.status_code |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
2 changes: 1 addition & 1 deletion
2
...ps/logger/fixtures/test_forms/survey_names/instances/survey_names_2012-08-17_11-24-53.xml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
<?xml version='1.0' ?><survey_names id="survey_names"><start>2012-08-17T11:24:53.254+03</start><name>HD</name><end>2012-08-17T11:24:57.897+03</end></survey_names> | ||
<?xml version='1.0' ?><survey_names id="survey_names"><start>2012-08-17T11:24:53.254+03</start><name>HD</name><end>2012-08-17T11:24:57.897+03</end><meta><instanceID>uuid:729f173c688e482486a48661700455ff</instanceID></meta></survey_names> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file added
BIN
+697 KB
...s/logger/fixtures/tutorial/instances/tutorial_with_attachment/1335783522563.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+6.21 MB
...es/tutorial_with_attachment/attachment_with_different_content/1335783522563.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
17 changes: 17 additions & 0 deletions
17
...tutorial/instances/tutorial_with_attachment/tutorial_2012-06-27_11-27-53_w_attachment.xml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
<?xml version='1.0' ?> | ||
<tutorial id="tutorial"> | ||
<name>Larry | ||
Again | ||
</name><!-- newline to make sure we can handle it --> | ||
<age>23</age> | ||
<picture>1335783522563.jpg</picture> | ||
<has_children>0</has_children> | ||
<gps>-1.2836198 36.8795437 0.0 1044.0</gps> | ||
<web_browsers>firefox chrome safari</web_browsers> | ||
<meta> | ||
<instanceID>uuid:729f173c688e482486a48661700455ff</instanceID> | ||
</meta> | ||
<formhub> | ||
<uuid>c2f940a2f1e8490d8d3c3030452ed401</uuid> | ||
</formhub> | ||
</tutorial> |
18 changes: 18 additions & 0 deletions
18
...ial/instances/tutorial_with_attachment/tutorial_2012-06-27_11-27-53_w_attachment_edit.xml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
<?xml version='1.0' ?> | ||
<tutorial id="tutorial"> | ||
<name>Larry | ||
Again | ||
</name><!-- newline to make sure we can handle it --> | ||
<age>23</age> | ||
<picture>1335783522563.jpg</picture> | ||
<has_children>0</has_children> | ||
<gps>-1.2836198 36.8795437 0.0 1044.0</gps> | ||
<web_browsers>firefox chrome safari</web_browsers> | ||
<meta> | ||
<instanceID>uuid:031c585a-88b7-4962-a566-849a7dd15891</instanceID> | ||
<deprecatedID>uuid:729f173c688e482486a48661700455ff</deprecatedID> | ||
</meta> | ||
<formhub> | ||
<uuid>c2f940a2f1e8490d8d3c3030452ed401</uuid> | ||
</formhub> | ||
</tutorial> |
Oops, something went wrong.