GPU Version #83

Open
wants to merge 194 commits into
base: master
Conversation

ErickHernandezGutierrez
Contributor

No description provided.

@ErickHernandezGutierrez
Contributor Author

I merged my branch with the current master. I disabled the use of textures so that we can first test without them.

@MarioOcampo MarioOcampo requested a review from fullbat October 12, 2021 07:49
Collaborator

@fullbat fullbat left a comment

Hi, I've tried the CUDA implementation on the tutorial dataset using the "VolumeFractions" model, specifying ndirs=1 both in trk2dictionary.run and when generating the kernels, but the fit returns NaN for all streamline weights and the IC map is empty.
As DWI I used the compartment_IC map created in a previous run with the usual CPU version of COMMIT.
Result using COMMIT CPU:
-> Fit model:

[ 00h 00m 50s ]

-> Saving results to "Results_VolumeFractions/*":
* Fitting errors:
- RMSE... [ 0.001 +/- 0.001 ]
- NRMSE... [ 0.071 +/- 9.243 ]
* Voxelwise contributions:
- Intra-axonal... [ OK ]
- Extra-axonal... [ OK ]
- Isotropic... [ OK ]
* Configuration and results:
- streamline_weights.txt... [ OK ]
- results.pickle... [ OK ]

Result using COMMIT GPU:
-> Fit model:

[ 00h 00m 00s ]

-> Saving results to "Results_VolumeFractions/*":
* Fitting errors:
- RMSE... [ 0.425 +/- 0.196 ]
- NRMSE... [ 0.974 +/- 0.159 ]
* Voxelwise contributions:
- Intra-axonal... [ OK ]
- Extra-axonal... [ OK ]
- Isotropic... [ OK ]
* Configuration and results:
- streamline_weights.txt... [ OK ]
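
The symptom reported above (every streamline weight is NaN) can be confirmed with a quick check like the following. This is a minimal sketch, not part of COMMIT: the helper name `summarize_weights` is hypothetical, and it assumes the weights file stores one floating-point value per line.

```python
import math


def summarize_weights(lines):
    """Count NaN and non-NaN entries in an iterable of text lines.

    Assumption: each non-empty line holds exactly one weight.
    """
    values = [float(line) for line in lines if line.strip()]
    n_nan = sum(math.isnan(v) for v in values)
    return {"total": len(values), "nan": n_nan, "not_nan": len(values) - n_nan}
```

To use it, open the weights file from the run (for example the streamline_weights.txt listed in the log, wherever it was saved) and pass the file object to `summarize_weights`. If `nan` equals `total`, the fit produced no usable weights.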

@ErickHernandezGutierrez
Contributor Author

The CUDA implementation works only for the CylinderZeppelinBall model. I did not try it with VolumeFractions. I tried it with StickZeppelinBall, but there was no improvement compared with the CPU. In the meantime, I could add a flag that checks the selected model and shows an error message when a model other than CylinderZeppelinBall is selected with CUDA. Then, we can add an efficient implementation for VolumeFractions.
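
The model check proposed in this comment could be sketched roughly as follows. All names here (`CUDA_SUPPORTED_MODELS`, `check_cuda_model`, the `use_cuda` argument) are hypothetical and are not taken from COMMIT's code; the only fact used from the thread is that CylinderZeppelinBall is the one model the CUDA code supports.

```python
# Hypothetical guard: the names below are illustrative, not COMMIT's actual API.
CUDA_SUPPORTED_MODELS = {"CylinderZeppelinBall"}


def check_cuda_model(model_name, use_cuda):
    """Raise early if CUDA is requested for a model without a GPU implementation."""
    if use_cuda and model_name not in CUDA_SUPPORTED_MODELS:
        supported = ", ".join(sorted(CUDA_SUPPORTED_MODELS))
        raise ValueError(
            f'CUDA is not supported for model "{model_name}"; '
            f"supported models: {supported}. Use the CPU version instead."
        )
```

Failing before the fit starts would turn the silent all-NaN result reported above into an explicit error. Supporting another model later would only require adding its name to the set.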

@daducci
Owner

daducci commented Oct 22, 2021

How difficult would it be to enable CUDA parallelization for the other models as well? Also, and most importantly, how fast do you think it would be, based on the current implementation?

@ErickHernandezGutierrez
Contributor Author

It would not be very difficult to add support for VolumeFractions, as it is simpler than the CylinderZeppelinBall model. However, I'm very busy with school and work until the first week of December. In the meantime, I could add a flag that checks the selected model and makes sure CUDA is enabled only when the CylinderZeppelinBall model is selected. That way, we would be able to submit something to ISMRM (without VolumeFractions experiments). I would prefer to wait until December to add support for VolumeFractions properly. I think that, for large datasets, we would get speedups similar to those of the CylinderZeppelinBall model. For small datasets, however, the parallel CPU version is much faster because the data transfer between CPU and GPU is a bottleneck.
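
The trade-off described here can be illustrated with a simple break-even estimate. This is only a back-of-the-envelope sketch under a simplifying assumption (transfer time equals bytes divided by a constant bandwidth, with no overlap between transfer and compute); the function name and parameters are hypothetical, and no numbers from the actual implementation are implied.

```python
def gpu_is_worthwhile(n_bytes, cpu_seconds, gpu_seconds, bandwidth_bytes_per_s):
    """Return True if GPU compute plus host<->device transfer beats the CPU time.

    Assumption: transfer cost is n_bytes / bandwidth, with no overlap
    between data transfer and kernel execution.
    """
    transfer_seconds = n_bytes / bandwidth_bytes_per_s
    return gpu_seconds + transfer_seconds < cpu_seconds
```

With this model, a small problem whose CPU time is already below the transfer time can never benefit from the GPU, however fast the kernels are. For large problems the transfer cost is amortized over much more computation.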

4 participants