S7evinK
Contributor

@S7evinK S7evinK commented Apr 16, 2024

@anoadragon453:

This PR adds Pydantic model-based validation to the POST /_matrix/client/v3/keys/upload endpoint. This prevents invalid request bodies from reaching the handler function and producing internal server errors.

Requires #18996 before unit tests will pass.


I initially wanted to transform the request body model into some attrs-based domain objects, rather than the bare dicts we're passing around internally today. Alas, this made the diff explode in size, so I've reverted to just using the body's raw dict. It is validated though! The main issue was that deeper in the stack we attempt to encode a portion of the dict to canonical JSON, and trying to do this with attrs objects was a nightmare.

I had also hoped to include a new UserIDType in the Pydantic model, instead of using StrictStr. But this caused a lot of faffing about with converting UserID to str and back, and we don't even end up pulling any data out of the Pydantic model anyhow. So I decided to ditch that.
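
To illustrate the str conversion friction described above, here is a minimal sketch using only the standard library. The `UserID` class and the `encode` helper are hypothetical stand-ins written for this example (they are not Synapse's `UserID` or its canonical JSON encoder); the point is only that a non-`str` object nested in the body cannot be JSON-encoded without rebuilding the containing dict.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UserID:
    """Hypothetical stand-in for a structured user ID (not Synapse's class)."""

    localpart: str
    domain: str

    def to_string(self) -> str:
        return f"@{self.localpart}:{self.domain}"


def encode(body: dict) -> str:
    # Simplified stand-in for canonical JSON: sorted keys, no whitespace.
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


raw_body = {"device_keys": {"user_id": "@alice:example.org", "device_id": "ABC"}}
typed_body = {
    "device_keys": {"user_id": UserID("alice", "example.org"), "device_id": "ABC"}
}

# A body made of plain strings encodes directly.
encoded_raw = encode(raw_body)

# A body holding a non-str object does not: json.dumps raises TypeError.
try:
    encode(typed_body)
    typed_encodes = True
except TypeError:
    typed_encodes = False

# To encode it, the containing dict has to be rebuilt with the ID as a string.
rebuilt = {
    "device_keys": {
        **typed_body["device_keys"],
        "user_id": typed_body["device_keys"]["user_id"].to_string(),
    }
}
encoded_rebuilt = encode(rebuilt)
```

With a single field the rebuild is cheap, but it has to be repeated for every nested object that carries a user ID, which is the cost that made `StrictStr` the pragmatic choice here.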

@S7evinK S7evinK requested a review from a team as a code owner April 16, 2024 15:00
@S7evinK S7evinK requested a review from a team November 22, 2024 08:51
@CLAassistant

CLAassistant commented Mar 23, 2025

CLA assistant check
All committers have signed the CLA.

anoadragon453 added a commit that referenced this pull request Sep 29, 2025
As we are now well past Synapse 1.135. This was originally added in #17097
I really wanted to use a `UserID` instead of a `StrictStr` for the fields that contain a user ID. But this became too
cumbersome due to the handler code wanting to directly json-encode the request body. As `UserID` does not subclass
`str`, one would have to rebuild the entire containing object in order to json-encode it. Perhaps a future PR will do this,
as it would allow us to validate `UserID`s more easily at the edge.
Some extra validation of the request body.
@anoadragon453 anoadragon453 force-pushed the s7evink/validate-upload-keys-dict branch from c43ca16 to 0eaf28f Compare September 30, 2025 17:33
@anoadragon453 anoadragon453 requested a review from a team September 30, 2025 17:34
@anoadragon453
Member

anoadragon453 commented Sep 30, 2025

This PR is now ready for re-review and has been rewritten using Pydantic to validate user input.

The failing Trial tests are expected until #18996 is merged.

@anoadragon453 anoadragon453 changed the title from "Ensure that uploaded keys are dicts" to "Validate the body of requests to /keys/upload" Oct 1, 2025
# storing the result in the DB. There's little point in converting to a
# parsed object and then back to a dict.
body = parse_json_object_from_request(request)
validate_json_object(body, self.KeyUploadRequestBody)
Contributor

We should use parse_and_validate_json_object_from_request(...) and pass in a concrete type to self.e2e_keys_handler.upload_keys_for_user(...)

Parse, don't validate (we shouldn't lose the type data after sussing it out)

Contributor

@MadLittleMods MadLittleMods Oct 2, 2025

Looks like this was already explained in the comment above:

# Parse the request body. Validate separately, as the handler expects a
# plain dict, rather than any parsed object.
#
# Note: It would be nice to work with a parsed object, but the handler
# needs to encode portions of the request body as canonical JSON before
# storing the result in the DB. There's little point in converting to a
# parsed object and then back to a dict.

Perhaps pragmatic and better than before so we can move forward with it ⏩


As a break-down of what upload_keys_for_user(...) does with the data:

  • upload_keys_for_user -> upload_device_keys_for_user -> encode_canonical_json
  • upload_keys_for_user -> _upload_one_time_keys_for_user, we manually iterate and re-encode anyway so a parsed object would be good here
  • upload_keys_for_user -> set_e2e_fallback_keys -> encode_canonical_json

Pydantic does support serialization so it wouldn't be that awkward if we just used a parsed object until we needed to serialize for encode_canonical_json.

If we want to avoid the deserialization/serialization, we could use TypedDict for these keys in the parsed object 🤔
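
A minimal sketch of the TypedDict idea, under stated assumptions: `DeviceKeysSketch` and `encode_canonical_json_sketch` are hypothetical names made up for this example, and the encoder is a simplified `json.dumps` stand-in rather than the real `encode_canonical_json`. The field names follow the `device_keys` object of the `/keys/upload` request body. A `TypedDict` is a plain `dict` at runtime, so the type information is available to the type checker while the value can still be handed straight to the encoder.

```python
import json
from typing import Dict, List, TypedDict


class DeviceKeysSketch(TypedDict):
    # Hypothetical type for illustration; static typing only, no runtime checks.
    user_id: str
    device_id: str
    algorithms: List[str]
    keys: Dict[str, str]


def encode_canonical_json_sketch(value: object) -> bytes:
    # Simplified stand-in for canonical JSON: sorted keys, no whitespace, UTF-8.
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def upload_device_keys_sketch(device_keys: DeviceKeysSketch) -> bytes:
    # No conversion step is needed: the TypedDict value is already a dict.
    return encode_canonical_json_sketch(device_keys)


device_keys: DeviceKeysSketch = {
    "user_id": "@alice:example.org",
    "device_id": "ABC",
    "algorithms": ["m.olm.v1.curve25519-aes-sha2"],
    "keys": {"ed25519:ABC": "base64key"},
}
encoded = upload_device_keys_sketch(device_keys)
```

Note that `TypedDict` does no validation by itself, so in this arrangement the Pydantic model (or an equivalent check) would still have to run at the edge before the body is treated as well-typed.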

Comment on lines +273 to +281
if "device_keys" in body:
    # Validate the provided `user_id` and `device_id` fields in
    # `device_keys` match that of the requesting user. We can't do
    # this directly in the pydantic model as we don't have access
    # to the requester yet.
    #
    # TODO: We could use ValidationInfo when we switch to Pydantic v2.
    # https://docs.pydantic.dev/latest/concepts/validators/#validation-info
    if body["device_keys"]["user_id"] != user_id:
Contributor

We could add a check/validate method to KeyUploadRequestBody for this logic

Plays into wanting to use parse_and_validate_json_object_from_request(...) as well
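
One possible shape for such a method, as a rough sketch only. The classes below are standard-library dataclasses standing in for the Pydantic model, and every name (`KeyUploadRequestBodySketch`, `DeviceKeysSketch`, `check_matches_requester`, `RequestValidationError`) is hypothetical. The idea is that the parsed body exposes a method that takes the requester's IDs, which are only known after authentication.

```python
from dataclasses import dataclass
from typing import Optional


class RequestValidationError(ValueError):
    """Hypothetical error type; a real endpoint would raise its own error."""


@dataclass
class DeviceKeysSketch:
    user_id: str
    device_id: str


@dataclass
class KeyUploadRequestBodySketch:
    """Stand-in for the parsed request body (not the actual Pydantic model)."""

    device_keys: Optional[DeviceKeysSketch] = None

    def check_matches_requester(self, user_id: str, device_id: str) -> None:
        # Nothing to check if the request did not include device keys.
        if self.device_keys is None:
            return
        if self.device_keys.user_id != user_id:
            raise RequestValidationError(
                "device_keys.user_id does not match the requesting user"
            )
        if self.device_keys.device_id != device_id:
            raise RequestValidationError(
                "device_keys.device_id does not match the requesting device"
            )
```

The servlet would then call `body.check_matches_requester(user_id, device_id)` once the requester is known, keeping the comparison logic next to the model's other validation.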

if not isinstance(v, dict):
    raise TypeError("fallback_keys must be a mapping")

for k, _ in v.items():
Contributor

Suggested change
- for k, _ in v.items():
+ for k in v.keys():

Might as well


one_time_keys: Optional[Mapping[StrictStr, Union[StrictStr, KeyObject]]] = None
"""
One-time public keys for “pre-key” messages. The names of the properties
Contributor

Suggested change
- One-time public keys for “pre-key” messages. The names of the properties
+ One-time public keys for "pre-key" messages. The names of the properties

Weird quotes 🤷
