Instances' meaning #1

Eevan-zq · 2024-11-21T01:48:00Z

Hello Sir, I would like to ask about the instance parameter that appears when defining a custom algorithm in MSCCL-TOOLS. I have never been clear about its meaning.

When looking at your code, I noticed the statement "for inst in range(ninstance):" in the to_xml.py file. I suspect that ninstance represents the number of processes created for each GPU, but I am not sure whether my understanding is correct.

liangyuRain · 2024-11-22T07:33:34Z

ninstance is the number of identical copies of the communication algorithm that you run simultaneously. On GPU hardware, you often need multiple instances sending and receiving data at the same time to fully utilize the bandwidth.

Eevan-zq · 2024-11-22T08:03:45Z

Thank you for your response. As a follow-up: what factors determine the number of instances? For example, is it proportional to the number of threads per GPU, the number of channels, or something else?

liangyuRain · 2024-11-26T07:58:16Z

Usually, if you aim for throughput (large data sizes), you want as many instances as possible. If the data size is small, you don't want many instances. In practice, different data sizes have different optimal ninstance values, so people usually switch schedules based on data size.
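
To make the idea of switching on data size concrete, here is a minimal sketch. It is not part of MSCCL or MSCCL-TOOLS: the function name `pick_ninstance`, the `SCHEDULES` table, and every threshold in it are made up for illustration, and real cut-over points would have to be measured on the target hardware.

```python
# Hypothetical table of (largest message size in bytes, ninstance) pairs.
# The thresholds are placeholders, not measured or recommended values.
SCHEDULES = [
    (64 * 1024, 1),        # small messages: few instances
    (4 * 1024 * 1024, 4),  # medium messages
    (float("inf"), 8),     # large messages: as many instances as is useful
]


def pick_ninstance(nbytes):
    """Return the ninstance of the first table entry whose bound covers nbytes."""
    for max_bytes, ninstance in SCHEDULES:
        if nbytes <= max_bytes:
            return ninstance
    raise ValueError(f"no schedule covers a message of {nbytes} bytes")


for size in (1024, 1 << 20, 1 << 30):
    print(size, pick_ninstance(size))
```

With this placeholder table, a 1 KiB message maps to 1 instance, a 1 MiB message to 4, and a 1 GiB message to 8.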

Eevan-zq · 2024-11-26T08:02:55Z

Yes, in my understanding, an instance is equivalent to the concept of parallelism, where multiple parallel GPUs execute the collective communication primitive algorithm simultaneously, right?

liangyuRain · 2024-11-26T08:04:40Z

The number of GPUs in a collective communication is constant regardless of ninstance. Increasing ninstance increases the number of parallel threadblocks or SMs executing the schedule.
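
As a rough illustration of that distinction, the toy model below replicates a schedule and counts the result. It is only a sketch under stated assumptions, not the actual to_xml.py logic: `replicate_threadblocks`, the dictionary layout, and the threadblock names are all hypothetical.

```python
def replicate_threadblocks(schedule, ninstance):
    """Toy model: copy every threadblock of every GPU ninstance times.

    schedule maps a GPU rank to a list of threadblock names (hypothetical layout).
    The result maps the same ranks to (threadblock name, instance index) pairs.
    """
    replicated = {}
    for rank, threadblocks in schedule.items():
        replicated[rank] = [
            (tb, inst)
            for inst in range(ninstance)
            for tb in threadblocks
        ]
    return replicated


# Two GPUs, each with one sending and one receiving threadblock.
base = {
    0: ["send_to_1", "recv_from_1"],
    1: ["send_to_0", "recv_from_0"],
}
scaled = replicate_threadblocks(base, ninstance=4)

print(len(scaled))     # number of GPUs, unchanged by ninstance
print(len(scaled[0]))  # threadblocks on GPU 0, multiplied by ninstance
```

In this model the GPU count stays at 2 while GPU 0 goes from 2 threadblocks to 8.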

Eevan-zq · 2024-11-26T08:11:42Z

Thank you for your response. So, in the XML file called by MSCCL, under each GPU tag, there are multiple thread tags, and the value of ninstance is positively correlated with the number of these thread blocks.
