Question about Mountpoint Client Performance #236

Closed
ryxli opened this issue Sep 21, 2024 · 5 comments
Labels: bug (Something isn't working), wontfix (This will not be worked on)

Comments

ryxli (Contributor) commented Sep 21, 2024

s3torchconnector version: latest
s3torchconnectorclient version: latest
AWS Region: us-east-1, ap-south-1
Running environment: EC2 instance

What happened?

I'm seeing a significant performance difference between the regular boto3 download_file (which uses the S3 CRT transfer config) and the mountpoint client, even with various settings for throughput and part size. The boto3 client just uses the default settings.

To reproduce, try downloading a 2 GB file from S3 with the mountpoint client (S3Reader) vs. the regular boto3 client.

# s3 torch connector snippet -- get_object_info, self._get_object_stream,
# bucket, key, and test_path come from the surrounding class
import time
from functools import partial

import boto3
from s3torchconnector import S3Reader

s3_reader = S3Reader(
    bucket,
    key,
    get_object_info=get_object_info,
    get_stream=partial(self._get_object_stream, bucket, key),
)
start = time.time()
s3_reader.read()  # reads the full object into memory
print(f"mountpoint finish in {time.time() - start}")

# boto3 snippet
client = boto3.client('s3')
start = time.time()
client.download_file(bucket, key, test_path)  # writes the object to disk
print(f"boto3 finish in {time.time() - start}")

Results:

boto3 finish in 4.756349802017212

THROUGHPUT_GBPS=400 PART_SIZE=8MB mountpoint finish in 11.611842155456543
THROUGHPUT_GBPS=400 PART_SIZE=16MB mountpoint finish in 11.204271793365479
THROUGHPUT_GBPS=400 PART_SIZE=32MB mountpoint finish in 14.50656008720398
THROUGHPUT_GBPS=400 PART_SIZE=64MB mountpoint finish in 14.8595449924469
THROUGHPUT_GBPS=400 PART_SIZE=128MB mountpoint finish in 16.200087547302246
THROUGHPUT_GBPS=200 PART_SIZE=8MB mountpoint finish in 11.326915740966797
THROUGHPUT_GBPS=200 PART_SIZE=16MB mountpoint finish in 11.493544578552246
THROUGHPUT_GBPS=200 PART_SIZE=32MB mountpoint finish in 14.47541093826294
THROUGHPUT_GBPS=200 PART_SIZE=64MB mountpoint finish in 14.830240249633789
THROUGHPUT_GBPS=200 PART_SIZE=128MB mountpoint finish in 16.252891063690186
THROUGHPUT_GBPS=100 PART_SIZE=8MB mountpoint finish in 11.260882139205933
THROUGHPUT_GBPS=100 PART_SIZE=16MB mountpoint finish in 11.112053871154785
THROUGHPUT_GBPS=100 PART_SIZE=32MB mountpoint finish in 14.596931219100952
THROUGHPUT_GBPS=100 PART_SIZE=64MB mountpoint finish in 14.992492437362671
THROUGHPUT_GBPS=100 PART_SIZE=128MB mountpoint finish in 16.195728063583374
THROUGHPUT_GBPS=50 PART_SIZE=8MB mountpoint finish in 10.574751377105713
THROUGHPUT_GBPS=50 PART_SIZE=16MB mountpoint finish in 10.638182163238525
THROUGHPUT_GBPS=50 PART_SIZE=32MB mountpoint finish in 14.243337154388428
THROUGHPUT_GBPS=50 PART_SIZE=64MB mountpoint finish in 14.779768705368042
THROUGHPUT_GBPS=50 PART_SIZE=128MB mountpoint finish in 16.239811897277832
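
For reference, a minimal sketch of how these throughput/part-size knobs might be wired, assuming they map to s3torchconnector's S3ClientConfig (throughput_target_gbps, part_size); this is an illustration, not the exact harness behind the numbers above:

# Hypothetical wiring of the THROUGHPUT_GBPS / PART_SIZE settings above;
# assumes S3ClientConfig exposes throughput_target_gbps and part_size and
# that S3Client accepts it via an s3client_config argument.
from s3torchconnector import S3ClientConfig
from s3torchconnector._s3client import S3Client

config = S3ClientConfig(
    throughput_target_gbps=400.0,   # THROUGHPUT_GBPS=400
    part_size=8 * 1024 * 1024,      # PART_SIZE=8MB
)
client = S3Client("us-east-1", s3client_config=config)
data = client.get_object("my_bucket", "large_2gb").read()  # placeholder bucket/key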

However, this performance gap seems to disappear in a multiprocess setting, again without any tuning of the boto3 TransferConfig.
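
A minimal sketch of that multiprocess comparison, for context (hypothetical harness; the worker count, bucket, and key are placeholders, not the exact setup used):

# Hypothetical multiprocess harness; BUCKET, KEY, and N_WORKERS are placeholders.
import time
from multiprocessing import Pool

import boto3
from s3torchconnector._s3client import S3Client

BUCKET, KEY, N_WORKERS = "my_bucket", "large_2gb", 4

def mountpoint_worker(i):
    # Each worker builds its own client and reads the whole object into memory.
    S3Client("us-east-1").get_object(BUCKET, KEY).read()

def boto3_worker(i):
    # Each worker downloads its own copy to disk.
    boto3.client("s3").download_file(BUCKET, KEY, f"/tmp/obj_{i}")

if __name__ == "__main__":
    for name, worker in [("mountpoint", mountpoint_worker), ("boto3", boto3_worker)]:
        tic = time.perf_counter()
        with Pool(N_WORKERS) as pool:
            pool.map(worker, range(N_WORKERS))
        print(f"{name}: {N_WORKERS} workers finish in {time.perf_counter() - tic:0.4f}s")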

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
ryxli added the bug label on Sep 21, 2024
matthieu-d4r (Contributor) commented Sep 23, 2024

Hello @ryxli, thank you for opening this issue.

I can't seem to reproduce the performance degradation you're observing: would you mind sharing more details? Namely, your instance type and your Python + boto3 + s3torchconnector versions?

Here's what I tried:

  1. Create an EC2 instance (AMI: Deep Learning OSS Nvidia Driver AMI GPU PyTorch 2.3.0 (Amazon Linux 2) 20240825, instance type: g4dn.2xlarge) in ap-south-1
  2. Create an S3 bucket in ap-south-1, and upload a 2 GB file to it
  3. SSH into the EC2 instance
  4. Create a Python venv, and run pip install s3torchconnector s3torchconnectorclient boto3 numpy
  5. Execute the following script:
import time

import boto3
from s3torchconnector._s3client import S3Client


def issue236():
    bucket = "my_bucket"
    key = "large_2gb"

    # s3 torch connector snippet
    s3_client = S3Client("ap-south-1")
    tic = time.perf_counter()
    s3_client.get_object(bucket, key).read()
    toc = time.perf_counter()
    print(f"mountpoint finishes in {toc - tic:0.4f} seconds")

    # boto3 snippet
    client = boto3.client('s3')
    tic = time.perf_counter()
    client.download_file(bucket, key, 'my_large_2gb')
    toc = time.perf_counter()
    print(f"boto3 finishes in {toc - tic:0.4f} seconds")


if __name__ == "__main__":
    issue236()

Overall, the PyTorch connector runs consistently faster than boto3 (example run):

mountpoint finishes in 4.7542 seconds
boto3 finishes in 5.7658 seconds

Finally, here are the versions used for this test:

s3torchconnector         1.2.5
s3torchconnectorclient   1.2.5
torch                    2.4.1
boto3                    1.35.24

ryxli (Contributor, Author) commented Sep 24, 2024

@matthieu-d4r

I'm still able to reproduce this issue with your code snippet, this time with a 6 GB object.

S3 bucket region: us-east-1
EC2 region: ap-south-1

import time

import boto3
from s3torchconnector._s3client import S3Client


def issue236():
    bucket = "my_bucket"  # placeholder; the actual bucket is in us-east-1
    key = "large_6gb"     # placeholder; a 6 GB object

    # s3 torch connector snippet
    s3_client = S3Client("us-east-1")
    tic = time.perf_counter()
    s3_client.get_object(bucket, key).read()
    toc = time.perf_counter()
    print(f"mountpoint finishes in {toc - tic:0.4f} seconds")

    # boto3 snippet
    client = boto3.client('s3')
    tic = time.perf_counter()
    client.download_file(bucket, key, 'my_large_6gb')
    toc = time.perf_counter()
    print(f"boto3 finishes in {toc - tic:0.4f} seconds")


issue236()

Output:

mountpoint finishes in 27.9438 seconds
boto3 finishes in 19.0174 seconds

Versions:

import s3torchconnector
import s3torchconnectorclient
import torch
import boto3

print(s3torchconnector.__version__)
print(s3torchconnectorclient.__version__)
print(torch.__version__)
print(boto3.__version__)

Output:

1.2.5
1.2.5
2.3.0a0+6ddf5cf85e.nv24.04
1.35.20

boto3 is installed with pip install 'boto3[crt]'.
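
To double-check that the CRT extra is actually picked up at runtime, a minimal check via botocore's HAS_CRT flag:

# Minimal check that boto3's CRT extra is installed and visible at runtime.
from botocore.compat import HAS_CRT

print("CRT available:", HAS_CRT)
if HAS_CRT:
    import awscrt
    print("awscrt version:", awscrt.__version__)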

matthieu-d4r (Contributor) commented

Hi @ryxli,

I ran the snippet again too, against a bucket in a different region (same setup as you: EC2 instance in ap-south-1 and S3 bucket in us-east-1), and still saw no performance degradation; I also installed boto3 with pip install 'boto3[crt]'.

One question, though: I noticed an unusual version string for your PyTorch build (2.3.0a0+6ddf5cf85e.nv24.04), which I found referenced in https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-24-04.html: are you running this snippet from within a container? Or are you executing it on a "raw" EC2 instance (i.e., SSH'ed in and nothing else)?

ryxli (Contributor, Author) commented Sep 25, 2024

I am running this snippet from within a container on the EC2 instance, and also from a Jupyter notebook.

matthieu-d4r (Contributor) commented

Hi @ryxli,

As discussed offline, we'll close this issue for now, as we were unable to reproduce the problem.

matthieu-d4r closed this as not planned on Oct 9, 2024
matthieu-d4r added the wontfix label on Oct 14, 2024