
Add unit tests for publish2cloud.py #162

Merged · 8 commits · Aug 18, 2020

Conversation

@boolean5 (Contributor) commented Jul 27, 2020

Implement unit tests for `chunk_metadata`, `new_data_to_publish_to_s3` and `publish_to_s3`.
Will close #130.

@boolean5 (Contributor, author) commented Aug 3, 2020

I added tests for new_data_to_publish_to_s3. Also, I refactored this function and fixed some minor bugs. For a full description of the changes see the commit message of 5aaf927.

Note: One of the changes was returning True right away when dealing with a new list that has not been uploaded to S3 yet. Previously, we were creating an S3 key with dummy list contents and comparing against its checksum. We should test this on staging by temporarily introducing a new section in the configuration file and confirming that the corresponding list is added to S3.
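The early-return change described above amounts to the following decision logic (a minimal sketch, not the actual publish2cloud.py code; the function shape and the `None`-means-missing convention are illustrative assumptions):

```python
def new_data_to_publish(remote_checksum, local_checksum):
    """Decide whether a list needs to be (re)uploaded to S3.

    remote_checksum is None when no S3 key exists for the list yet.
    """
    if remote_checksum is None:
        # Brand-new list: publish immediately instead of creating a key
        # with dummy contents and comparing against its checksum.
        return True
    return remote_checksum != local_checksum
```

With this shape, adding a new section to the configuration file always results in an upload on the first run, which is exactly what the staging test should confirm.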

@boolean5 (Contributor, author) commented Aug 4, 2020

Just added unit tests for `publish_to_s3`.

@boolean5 boolean5 marked this pull request as ready for review August 4, 2020 14:32
@boolean5 boolean5 mentioned this pull request Aug 13, 2020
Return the checksum of the contents of the list file starting right
after the chunk header. Previously, we were always starting from the
26th byte, regardless of the size of the header.
Move the code that uses the configuration file to determine if a list
should be uploaded to S3 outside of `new_data_to_publish_to_s3` and into
`publish_to_cloud`. Rename `check_upload_remote_settings_config` to
`check_upload_config` and adjust it to work both for S3 and remote
settings. In `publish_to_cloud` make sure that
`new_data_to_publish_to_s3` is only called when `upload_to_s3` is True,
to avoid the cost of unnecessarily accessing S3.

In addition, when the key corresponding to a list does not yet exist on
S3, return True right away instead of creating a key with dummy list
contents and comparing against its checksum.

Also, fix a minor bug in getting the name of the S3 key from the
configuration file. Previously, if the section did not have the `s3_key`
option, `new_data_to_publish_to_s3` would raise a `NoOptionError`
instead of falling back to using the `output` option as the S3 key name.
Raise a ValueError when the `s3_key` option exists but is empty.

Finally, add a docstring to describe the functionality of
`new_data_to_publish_to_s3`.
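The `s3_key` fallback just described might look like this in isolation (a Python 3 `configparser` sketch; the helper name, error message, and section contents are assumptions, not the script's exact code):

```python
from configparser import ConfigParser

def s3_key_name(config, section):
    """Resolve the S3 key name for a list section.

    Fall back to the `output` option when `s3_key` is absent, instead of
    letting a NoOptionError escape, and reject an empty `s3_key`.
    """
    if config.has_option(section, 's3_key'):
        key_name = config.get(section, 's3_key')
        if not key_name:
            raise ValueError('s3_key is empty in section %r' % section)
        return key_name
    return config.get(section, 'output')

config = ConfigParser()
config.read_string("""
[tracking]
output = mozpub-track-digest256
""")
# No s3_key option here, so the output name is used as the key name.
```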
Add moto as a requirement and implement unit tests for
`new_data_to_publish_to_s3`.
Add unit tests to check that the expected S3 key permissions are set in
`new_data_to_publish_to_s3` and `publish_to_s3`. These tests are currently
expected to fail because setting key permissions with `add_user_grant()`
does not work in moto.
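One way around moto's gap with `add_user_grant` is to assert the grant calls against a plain mock key instead. A self-contained illustration of the idea (`set_list_permissions` and the canonical IDs are made up for the example; the real tests target `publish_to_s3` itself):

```python
from unittest import mock

def set_list_permissions(key, user_ids):
    """Grant each downstream consumer read access to an uploaded key.

    Stand-in for the permission step in publish_to_s3, which uses
    boto's Key.add_user_grant (not emulated by moto).
    """
    for user_id in user_ids:
        key.add_user_grant('READ', user_id)

# A Mock key records the grant calls, so the expected permissions can
# be asserted without touching S3 or moto.
key = mock.Mock()
set_list_permissions(key, ['canonical-id-1', 'canonical-id-2'])
key.add_user_grant.assert_any_call('READ', 'canonical-id-1')
assert key.add_user_grant.call_count == 2
```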
@say-yawn (Contributor) left a comment:

LGTM with one minor change requested.

```diff
-    eoh = header.find('\n')
-    chunktype, chunknum, hash_size, data_len = header[:eoh].split(':')
+    header = fp.readline().rstrip('\n')
+    chunktype, chunknum, hash_size, data_len = header.split(':')
     return dict(
         type=chunktype, num=chunknum, hash_size=hash_size, len=data_len,
         checksum=hashlib.sha256(fp.read()).hexdigest()
     )
```
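Made runnable in Python 3, the updated parsing looks roughly like this (binary mode so the checksum is computed over bytes; a sketch, not the file's exact code):

```python
import hashlib
import io

def chunk_metadata(fp):
    """Parse the chunk header and checksum everything after it.

    readline() consumes exactly the header line, so the checksum starts
    right after the header regardless of its length, rather than at a
    fixed 26-byte offset.
    """
    header = fp.readline().rstrip(b'\n')
    chunktype, chunknum, hash_size, data_len = header.decode().split(':')
    return dict(
        type=chunktype, num=chunknum, hash_size=hash_size, len=data_len,
        checksum=hashlib.sha256(fp.read()).hexdigest(),
    )

# A header longer than 25 bytes no longer corrupts the checksum.
meta = chunk_metadata(io.BytesIO(b'a:1234567:32:4096\n' + b'\x00' * 8))
```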

So with this update we will now get a more accurate checksum: it excludes the header but covers the rest of the file. When I merge this to stage I expect most, if not all, lists to be updated. Once that is confirmed, I should let the ops team know that large requests may be coming in.


To make better use of the updated lists, so that we scale our service to deliver real updates rather than artifacts of our list creation script, we may want to coordinate shipping this change to prod with the changes from shavar-prod-lists.
