Instances' meaning #1

Open
Eevan-zq opened this issue Nov 21, 2024 · 6 comments

Comments

@Eevan-zq

Hello Sir, I would like to ask about the instance parameter that appears when defining a custom algorithm in MSCCL-TOOLS. I have never been clear about its meaning.

When looking at your code, I noticed the statement `for inst in range(ninstance):` in the `to_xml.py` file. I suspect that `ninstance` represents the number of processes created for each GPU. I am not sure if my understanding is correct.

@liangyuRain
Owner

`ninstance` is the number of identical copies of the communication algorithm that run simultaneously. On GPU hardware, you often need multiple instances sending and receiving data at the same time to fully utilize the bandwidth.
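To make the idea of "identical copies" concrete, here is a minimal sketch of replicating a schedule once per instance. It is not the actual `to_xml.py` code: the `Transfer` type, the `replicate_schedule` helper, and the ring example are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One send in a schedule: source GPU, destination GPU, owning instance."""
    src: int
    dst: int
    inst: int


def replicate_schedule(base_sends, ninstance):
    """Return one copy of every (src, dst) send for each instance.

    Hypothetical helper: it only models the replication idea and does not
    reproduce how MSCCL-TOOLS builds its schedules.
    """
    if ninstance < 1:
        raise ValueError("ninstance must be >= 1")
    return [
        Transfer(src, dst, inst)
        for inst in range(ninstance)
        for (src, dst) in base_sends
    ]


# A 4-GPU ring used purely as an example schedule.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
replicated = replicate_schedule(ring, ninstance=2)

# The set of GPUs is unchanged; only the number of concurrent sends grows.
gpus = {t.src for t in replicated} | {t.dst for t in replicated}
print(len(replicated), sorted(gpus))
```

In this sketch, doubling `ninstance` doubles the number of concurrent sends between the same four GPUs, which is the sense in which more instances can help saturate a link.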

@Eevan-zq
Author

Thank you for your response. I would like to follow up by asking: what factors determine the number of instances? For example, is it proportional to the number of threads per GPU, the number of channels, or something else?

@liangyuRain
Owner

Usually, if you aim for throughput (large data sizes), you want as many instances as possible. If the data size is small, you don't want many instances. In practice, different data sizes have different optimal `ninstance` values, so people usually switch schedules based on data size.
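Switching by data size can be sketched as a simple lookup. The thresholds and instance counts below are made-up placeholders, not measured or recommended values, and `pick_ninstance` is a hypothetical helper rather than part of MSCCL or MSCCL-TOOLS; real cut-over points would have to come from benchmarking on the target hardware.

```python
import bisect

# Hypothetical (inclusive upper size bound in bytes, ninstance) pairs,
# sorted by size. These numbers are placeholders for illustration only.
SCHEDULE_TABLE = [
    (64 * 1024, 1),        # small messages: few instances
    (4 * 1024 * 1024, 4),  # medium messages
]
DEFAULT_NINSTANCE = 8      # anything larger than the last bound


def pick_ninstance(size_bytes, table=SCHEDULE_TABLE, default=DEFAULT_NINSTANCE):
    """Return the ninstance of the first table entry whose bound covers size_bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    bounds = [bound for bound, _ in table]
    idx = bisect.bisect_left(bounds, size_bytes)
    return table[idx][1] if idx < len(table) else default


for size in (1024, 1024 * 1024, 1 << 30):
    print(size, pick_ninstance(size))
```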

@Eevan-zq
Author

Yes, in my understanding, an instance is equivalent to the concept of parallelism, where multiple parallel GPUs execute the collective communication primitive algorithm simultaneously, right?

@liangyuRain
Owner

The number of GPUs in a collective communication is constant regardless of `ninstance`. Increasing `ninstance` increases the number of parallel threadblocks (or SMs) executing the schedule.

@Eevan-zq
Author

Thank you for your response. So, in the XML file used by MSCCL, each GPU tag contains multiple thread block tags, and the value of `ninstance` is positively correlated with the number of these thread blocks.
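The relationship described above can be illustrated with a small generator that emits one group of thread block elements per instance under each GPU element. This is only a sketch under assumptions: the element names `algo`, `gpu`, and `tb`, the `inst` attribute, and the `build_algo_xml` helper are illustrative and are not claimed to match the exact MSCCL schema or the output of `to_xml.py`.

```python
import xml.etree.ElementTree as ET


def build_algo_xml(ngpus, tbs_per_instance, ninstance):
    """Build a toy XML tree with ninstance * tbs_per_instance thread blocks per GPU.

    Element and attribute names are assumptions made for this example.
    """
    algo = ET.Element("algo", {"ngpus": str(ngpus)})
    for gpu_id in range(ngpus):
        gpu = ET.SubElement(algo, "gpu", {"id": str(gpu_id)})
        tb_id = 0
        # Same loop shape as the `for inst in range(ninstance):` statement
        # quoted earlier in the thread.
        for inst in range(ninstance):
            for _ in range(tbs_per_instance):
                ET.SubElement(gpu, "tb", {"id": str(tb_id), "inst": str(inst)})
                tb_id += 1
    return algo


algo = build_algo_xml(ngpus=4, tbs_per_instance=2, ninstance=3)
tb_counts = [len(gpu.findall("tb")) for gpu in algo.findall("gpu")]
print(tb_counts)  # [6, 6, 6, 6]
```

In this toy model the number of GPU elements stays at four while the thread block count per GPU grows linearly with `ninstance`.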
