
OOM downloading large files #10286

Open
max-degterev opened this issue Dec 31, 2024 · 2 comments
Labels
status: needs-triage Possible bug which hasn't been reproduced yet

Comments


max-degterev commented Dec 31, 2024

Describe the Bug

I have an upcoming task that involves uploading/downloading large files, and it seems Payload will go OOM. Would it be possible to stream data to/from S3 instead of buffering it?

Downloads are as easy as passing object.Body.transformToWebStream() straight to new Response(). I'm not sure what the lift is on uploads, but I assume skipping buffering for non-image files should be feasible?
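A minimal runnable sketch of the download side. Here a hand-built web ReadableStream stands in for s3Response.Body.transformToWebStream(); the point is only that Response consumes the stream incrementally rather than requiring the whole body in memory (Node 18+, where ReadableStream and Response are globals):

```javascript
// Hand-built web ReadableStream standing in for
// s3Response.Body.transformToWebStream() (the real source in the handler).
const encoder = new TextEncoder();
const chunks = ['hello ', 'world'];

const stream = new ReadableStream({
  start(controller) {
    for (const chunk of chunks) controller.enqueue(encoder.encode(chunk));
    controller.close();
  },
});

// Response accepts the stream directly, so the body is forwarded chunk by
// chunk instead of being buffered into memory first.
const res = new Response(stream, {
  headers: { 'Content-Type': 'application/octet-stream' },
});

const text = await res.text();
```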

Alternatively, would it be possible to expose storageClient (https://github.com/payloadcms/payload/blob/main/packages/storage-s3/src/index.ts#L108) so that userland code can make a custom call without having to install the same dependencies and create another client instance?

Link to the code that reproduces this issue

https://github.com/payloadcms/payload/blob/main/packages/storage-s3/src/staticHandler.ts#L53

Reproduction Steps

Use the s3Adapter and try to upload/download a 1 GB file on a system with 512 MB of RAM.

Which area(s) are affected? (Select all that apply)

plugin: cloud-storage

Environment Info

v3.12
max-degterev added the status: needs-triage (Possible bug which hasn't been reproduced yet) and validate-reproduction labels on Dec 31, 2024
@max-degterev (Author)

Hackfix:

import path from 'path';
import * as AWS from '@aws-sdk/client-s3';
import { secureConfig } from '@/config/cloud';
import type { File as S3File } from '@/payload-types';

// Reuse a single S3 client instead of creating one per request
let secureClient: AWS.S3 | null = null;
const getStorageClient = (): AWS.S3 => {
  if (!secureClient) secureClient = new AWS.S3(secureConfig.config);
  return secureClient;
};

export const getSecureFile = async (file: S3File) =>
  getStorageClient().getObject({
    Bucket: secureConfig.bucket,
    Key: path.posix.join(file.prefix!, file.filename!),
  });

// somewhere in handlers:
const s3Response = await getSecureFile(fileData);

if (!s3Response.Body) return getErrorResponse(ERROR_NOT_FOUND);

// Stream the S3 body straight into the Response instead of buffering it
const stream = s3Response.Body.transformToWebStream();

return new Response(stream, {
  headers: {
    'Content-Disposition': contentDisposition(fileData.filename!),
    'Content-Type': s3Response.ContentType!,
  },
});
@max-degterev (Author)

Actually, come to think of it, sharp supports streaming, so there are easy performance gains on the upload side too: piping the resized upload directly to S3 instead of buffering means less memory consumed and less time to process.
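The shape of that pipeline can be sketched with Node's stream.pipeline. All three stages here are hypothetical stand-ins: the Readable plays the incoming file upload, the Transform plays sharp().resize() (sharp instances are duplex streams), and the Writable plays the body sink of an S3 multipart upload (e.g. @aws-sdk/lib-storage's Upload). The point is that chunks flow end to end and memory use stays bounded by the stream high-water marks, not the file size:

```javascript
import { pipeline } from 'node:stream/promises';
import { Readable, Transform, Writable } from 'node:stream';

const received = [];

// Stand-in for the incoming file upload stream
const upload = Readable.from([Buffer.from('raw image bytes')]);

// Stand-in for sharp().resize(); a real sharp instance is also a duplex
// stream and slots into the pipeline the same way
const resize = new Transform({
  transform(chunk, _enc, callback) {
    // Pretend transformation; sharp would emit resized image chunks here
    callback(null, chunk.toString().toUpperCase());
  },
});

// Stand-in for the S3 upload body sink
const sink = new Writable({
  write(chunk, _enc, callback) {
    received.push(chunk.toString());
    callback();
  },
});

// Each chunk flows upload -> resize -> sink without buffering the whole file
await pipeline(upload, resize, sink);
```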
