Steps towards Multi-GPU support / significant performance improvement #135

Open
5 tasks
slarson opened this issue Dec 19, 2017 · 6 comments
@slarson
Member

slarson commented Dec 19, 2017

Aiming for a best-case 12x speedup on an 8-GPU cluster

  • Run GPUSPH on the target hardware and compare its performance with Sibernetic on a 100K-particle simulation.
  • Devise a strategy for reusing the key components that enable multi-GPU support in GPUSPH.
    • CUDA based?
    • OpenCL based?

Also examine whether significant performance improvements are possible via other routes:

  • Scale & timestep-based?
@slarson
Member Author

slarson commented Dec 27, 2017

Initial investigations into modifying scale and timestep suggest that a significant performance improvement is possible. The next step is to adjust viscosity, elasticity, and surface tension so that a scaled-up worm retains low-Reynolds-number behavior. We also need to keep an eye on integration error as we do this.
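
To make the low-Reynolds-number constraint concrete, here is a minimal sketch using the standard definition Re = ρ·v·L/μ. It is not taken from Sibernetic; the function names and all numeric values are hypothetical and chosen only for illustration. It assumes the characteristic velocity scales together with the length.

```python
def reynolds_number(density, velocity, length, viscosity):
    """Standard definition Re = rho * v * L / mu (dimensionless)."""
    return density * velocity * length / viscosity


def viscosity_for_same_reynolds(viscosity, length_scale,
                                velocity_scale=1.0, density_scale=1.0):
    """Dynamic viscosity needed after scaling so that Re stays unchanged."""
    return viscosity * length_scale * velocity_scale * density_scale


# Hypothetical illustrative values in SI units, not measured worm parameters.
rho, v, L, mu = 1000.0, 2.0e-4, 1.0e-3, 1.0e-3
re_before = reynolds_number(rho, v, L, mu)

# Scale length and velocity by the same (assumed) factor s.
s = 10.0
mu_scaled = viscosity_for_same_reynolds(mu, length_scale=s, velocity_scale=s)
re_after = reynolds_number(rho, v * s, L * s, mu_scaled)
```

Under these assumptions the viscosity has to grow by s² to hold Re fixed; elasticity and surface tension would need their own scaling arguments, which this sketch does not cover.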

On the GPUSPH side, initial results showed that for a 13K-particle simulation, performance actually degrades when the run is spread across multiple nodes. We suggest trying much larger simulations of 100K-200K particles for comparison.

@skhayrulin is pursuing an algorithm for synchronizing across multiple computing devices in Sibernetic via OpenCL.

@slarson
Member Author

slarson commented Jan 9, 2018

@skhayrulin has finished an abstract algorithm for distributing particles across 2 devices but still needs to test it across more devices.

@a-palyanov is continuing to experiment with combinations of physical parameters that allow the time step to be increased while maintaining physical consistency. He has achieved 1 second of simulation time for 20 minutes of compute time (!!). We are working to get this version out ASAP.
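
As a quick sanity check on what that figure means, the arithmetic can be sketched as below. The helper names and the 10-second example are hypothetical, and the estimate assumes the cost per simulated second stays constant.

```python
def slowdown_factor(compute_seconds, simulated_seconds):
    """How many seconds of compute are spent per simulated second."""
    return compute_seconds / simulated_seconds


def compute_hours(simulated_seconds, factor):
    """Wall-clock hours needed for a run, assuming a constant slowdown factor."""
    return simulated_seconds * factor / 3600.0


# 20 minutes of compute for 1 second of simulation, as reported above.
factor = slowdown_factor(20 * 60, 1.0)

# Hypothetical example: a 10-second simulation at the same rate.
hours_for_10s = compute_hours(10.0, factor)
```

So the reported run is about 1200 times slower than real time, and a 10-second simulation would take a little over three hours at that rate.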

@slarson
Member Author

slarson commented Feb 9, 2018

After some success seeing linearly scaling performance on the multi-CPU setup, we are still looking at https://github.com/DualSPHysics/DualSPHysics to see if they will implement multi-GPU support.

@VahidGh
Member

VahidGh commented Feb 9, 2018 via email

@slarson
Member Author

slarson commented Feb 11, 2018

@skhayrulin Any luck building DualSPHysics?

@skhayrulin
Member

@slarson I've finally compiled DualSPHysics on my machine with one NVIDIA GPU and run it. Unfortunately, it finished with a segmentation fault, but at least it compiled and ran. I can continue to dig further.
