
Add a field to limit the size of uploading content #1701

Draft · wants to merge 1 commit into main
Conversation

git-hyagi (Contributor)

closes: #532

@lubosmj (Member) left a comment

Was the goal to read blob data in smaller chunks and in case we are blocking the API for too long, we raise an error?

while True:
    subchunk = chunk.read(2000000)
    if not subchunk:
        break
    temp_file.write(subchunk)
    size += len(subchunk)
    for algorithm in Artifact.DIGEST_FIELDS:
        hashers[algorithm].update(subchunk)

I see a threshold defined for config blobs: https://github.com/containers/image/blob/1dbd8fbbe51653e8a304122804431b07a1060d06/internal/image/oci.go#L63-L83. But, I could not find any thresholds for regular blobs. Can we investigate this?
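A size-capped variant of that loop might look like the following minimal sketch; `MAX_BLOB_SIZE`, `PayloadTooLarge`, and `copy_with_limit` are hypothetical names, not code from the PR, and the real loop also updates the digest hashers:

```python
import io

MAX_BLOB_SIZE = 100 * 1024 * 1024  # hypothetical cap, not a value from the PR

class PayloadTooLarge(Exception):
    pass

def copy_with_limit(chunk, temp_file, limit=MAX_BLOB_SIZE):
    """Copy a file-like object in subchunks, aborting once the cap is hit."""
    size = 0
    while True:
        subchunk = chunk.read(2_000_000)
        if not subchunk:
            break
        size += len(subchunk)
        if size > limit:
            raise PayloadTooLarge()
        temp_file.write(subchunk)
    return size
```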

@@ -938,6 +942,10 @@ def put(self, request, path, pk=None):
        repository.pending_blobs.add(blob)
        return BlobResponse(blob, path, 201, request)

    def _verify_payload_size(self, distribution, chunk):
        if distribution.max_payload_size and chunk.reader.length > distribution.max_payload_size:
            raise PayloadTooLarge()
Member


Is this exception properly rendered to the container's API format?

ref:

def handle_exception(self, exc):

Member


Some of the exceptions are properly documented at https://docker-docs.uclv.cu/registry/spec/api/#blob-upload.

Contributor Author


Ah... my bad, I didn't know about the spec for errors.
I also found this doc, https://github.com/opencontainers/distribution-spec/blob/v1.0.1/spec.md#error-codes, which will be helpful.
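Per that spec, error responses carry a JSON body with an `errors` array of `code`/`message`/`detail` entries. A minimal sketch of rendering one (the function name is hypothetical, not pulp_container API):

```python
import json

def render_registry_error(code, message, detail=None):
    """Build an OCI distribution-spec style error response body."""
    return json.dumps(
        {"errors": [{"code": code, "message": message, "detail": detail or {}}]}
    )
```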

@@ -751,6 +751,8 @@ class ContainerDistribution(Distribution, AutoAddObjPermsMixin):
        null=True,
    )

    max_payload_size = models.IntegerField(null=True)
Member


I do not think we want to make this parameter configurable. I vote for having a constant for this.
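For contrast, the constant-based approach suggested here could be as simple as the following sketch (names and value are hypothetical, not pulp_container code):

```python
# Hypothetical module-level cap replacing the configurable model field.
PAYLOAD_MAX_SIZE = 100 * 1024 * 1024  # 100 MiB, illustrative value only

def payload_size_ok(length):
    """Return True when an upload of `length` bytes is within the cap."""
    return length <= PAYLOAD_MAX_SIZE
```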

@git-hyagi
Contributor Author

> Was the goal to read blob data in smaller chunks and in case we are blocking the API for too long, we raise an error?

Hmm... when I was writing this PR, I was thinking of a feature that enforces a maximum size for blob layers and denies the upload if the limit is exceeded.
Did I misunderstand the goal of the issue?

> I see a threshold defined for config blobs: https://github.com/containers/image/blob/1dbd8fbbe51653e8a304122804431b07a1060d06/internal/image/oci.go#L63-L83. But, I could not find any thresholds for regular blobs. Can we investigate this?

I also couldn't find a limit for regular blobs. From what I could understand, they limit only the size of buffered "resources" (manifests, config blobs, signatures, etc.) to avoid OOM issues:
containers/image@61096ab
"Restrict the sizes of blobs which are copied into memory such as the
manifest, the config, signatures, etc. This will protect consumers of
c/image from rogue or hijacked registries that return too big blobs in
hope to cause an OOM DOS attack."

@lubosmj
Member

lubosmj commented Jul 18, 2024

Okay, I think we can follow that path. Let's then focus on manifests, config blobs, and signatures exclusively.

@git-hyagi
Contributor Author

After an internal discussion, here is the conclusion we reached:
Considering that a proper installation of Pulp should sit behind a reverse proxy, and that by the time an oversized upload request reaches Pulp the damage (denial of service) to the reverse proxy is already done, we should define the limits on the reverse proxy, not in Pulp.
Here is a draft of an nginx config to limit the manifest size:

location ~* /v2/.*/manifests/.*$ {
	client_max_body_size 1m;
}

@git-hyagi git-hyagi marked this pull request as draft July 23, 2024 14:11
@ipanova
Member

ipanova commented Aug 14, 2024

This change makes sense to me; however, besides the manifests directive we also need to cap the extensions/v2/(?P<path>.+)/signatures endpoint. This is an alternative way to upload skopeo-produced atomic signatures.

Config blobs are left out, but I don't see an easy way to set a 4 MB limit specifically on them, given that they are matched by location ~* /v2/.*/blobs/.*$. I would be OK with proceeding only with signatures and manifests.
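A combined nginx sketch covering both endpoints could look like this (the location patterns and size values are untested assumptions, not configuration from the PR):

```nginx
# Cap manifest uploads at 1 MiB and signature uploads at 4 MiB.
location ~* /v2/.*/manifests/.*$ {
    client_max_body_size 1m;
}
location ~* /extensions/v2/.*/signatures.*$ {
    client_max_body_size 4m;
}
```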

One thing I do not understand: if a malicious user can create a big manifest that leads to big memory consumption, what prevents them from just creating a big blob? There is no limit set on blobs.

@ipanova
Member

ipanova commented Aug 14, 2024

Skimming through the containers/image@61096ab changes and commit description, they are mostly targeted at the client side, so that the client is not susceptible to a DoS attack during a pull operation.
This reads to me as if the registry itself does not have limits on upload. If that's the case, then the only place where we need to introduce a safeguard is the sync operation, because there we act as a client, so it would be good to read only the first 4 MB of the manifests/config blobs/signatures and then raise an error.
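A client-side sketch of that sync safeguard, assuming we stream the remote resource in chunks (all names here are hypothetical, not pulp_container API):

```python
MAX_RESOURCE_SIZE = 4 * 1024 * 1024  # 4 MiB, mirroring the containers/image cap

class ResourceTooLarge(Exception):
    pass

def read_capped(chunks, limit=MAX_RESOURCE_SIZE):
    """Accumulate streamed chunks of a manifest/config blob/signature,
    raising once the running total exceeds the cap."""
    data = bytearray()
    for chunk in chunks:
        data.extend(chunk)
        if len(data) > limit:
            raise ResourceTooLarge(f"resource exceeds {limit} bytes")
    return bytes(data)
```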

@ipanova
Member

ipanova commented Aug 14, 2024

Hm, apparently the changes were also applied to some upload calls, like https://github.com/containers/image/blob/61096ab72530cc9216b50d9fc3bee90b53704f68/docker/docker_image_dest.go#L630.

Successfully merging this pull request may close these issues.

As a user, there is a limit to the size of content that is accepted to be retrieved via live api