Commit
* wip - Implement offloading of literals
* Fix use of metadata bucket prefix
* Fix repeated use of uri
* Add temporary representation for offloaded literal
* Add one unit test
* Add another test
* Stylistic changes to the two tests
* Add test for min offloading threshold set to 1MB
* Pick a unique engine-dir for tests
* s/new_outputs/literal_map_copy/
* Remove unused constant
* Use output_prefix in definition of offloaded literals
* Add initial version of pbhash.py
* Add tests to verify that overriding the hash is carried over to offloaded literals
* Add a few more tests
* Always import ParamSpec from `typing_extensions`
* Fix lint warnings
* Set inferred_type using the task type interface
* Add comment about offloaded literals files and how they are uploaded to the metadata bucket
* Add offloading_enabled
* Add more unit tests including a negative test
* Fix bad merge
* Incorporate feedback.
* Fix image name (unrelated to this PR - just a nice-to-have to decrease flakiness)
* Add `is_map_task` to `_dispatch_execute`

---------

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>
1 parent 1e7306c, commit 4ebeae5
Showing 9 changed files with 679 additions and 18 deletions.
@@ -0,0 +1,39 @@
# This is a module that provides hashing utilities for Protobuf objects.
import base64
import hashlib
import json

from google.protobuf import json_format
from google.protobuf.message import Message


def compute_hash(pb: Message) -> bytes:
    """
    Computes a deterministic hash in bytes for the Protobuf object.
    """
    try:
        pb_dict = json_format.MessageToDict(pb)
        # json.dumps with sorted keys to ensure stability
        stable_json_str = json.dumps(
            pb_dict, sort_keys=True, separators=(",", ":")
        )  # separators to ensure no extra spaces
    except Exception as e:
        raise ValueError(f"Failed to marshal Protobuf object {pb} to JSON with error: {e}")

    try:
        # Deterministically hash the JSON object to a byte array. Using SHA-256 for hashing here,
        # assuming it provides a consistent hash output.
        hash_obj = hashlib.sha256(stable_json_str.encode("utf-8"))
    except Exception as e:
        raise ValueError(f"Failed to hash JSON for Protobuf object {pb} with error: {e}")

    # The digest is guaranteed to be 32 bytes long
    return hash_obj.digest()


def compute_hash_string(pb: Message) -> str:
    """
    Computes a deterministic hash in base64 encoded string for the Protobuf object
    """
    hash_bytes = compute_hash(pb)
    return base64.b64encode(hash_bytes).decode("utf-8")
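
For reviewers who want to check the stability property without a Protobuf dependency, the same technique (canonical JSON with sorted keys and compact separators, then SHA-256, then base64) can be sketched with the standard library alone. This is an illustration, not part of the commit: the names `stable_digest` and `stable_digest_string` are hypothetical, and a plain dict stands in for the output of `json_format.MessageToDict`.

```python
import base64
import hashlib
import json


def stable_digest(obj: dict) -> bytes:
    """Hash a JSON-serialisable dict via canonical JSON and SHA-256.

    Hypothetical helper: the dict stands in for a Protobuf message that
    has already been converted to a dictionary.
    """
    # sort_keys removes dependence on insertion order; compact separators
    # remove dependence on whitespace.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()


def stable_digest_string(obj: dict) -> str:
    """Base64-encode the digest so it can be stored as text."""
    return base64.b64encode(stable_digest(obj)).decode("utf-8")


# Same content, different key order at both nesting levels.
first = {"name": "x", "value": {"b": 2, "a": 1}}
second = {"value": {"a": 1, "b": 2}, "name": "x"}

assert stable_digest(first) == stable_digest(second)
assert len(stable_digest(first)) == 32          # SHA-256 digest size
assert len(stable_digest_string(first)) == 44   # base64 of 32 bytes
```

The sketch only shows that key ordering and whitespace do not affect the digest. Whether the dictionary conversion is itself stable for every message type is a separate question that this example does not address.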