This repository has been archived by the owner on Aug 18, 2020. It is now read-only.

Medical imaging tutorial #172

Merged: 21 commits, Mar 15, 2020

moritzschwyzer
Contributor

I created a medical imaging tutorial that shows how to work with X-ray DICOM data. To support this, I updated the load_image function in vision.core so that it can load DICOM files using pydicom. For the tutorial, I created a small subset of the SIIM-ACR Pneumothorax Segmentation dataset with 250 DICOM files and a .csv file with labels. I informed the SIIM and they gave consent to create this subset and put it online. I'm currently hosting the .tgz file on a server of mine, but I think it would be a good idea to put it on a CDN together with the other datasets. I can forward you the email with the permission statement from the SIIM to use their data if needed.

@sgugger
Contributor

sgugger commented Mar 15, 2020

I would very much like vision.core to stay independent from pydicom to avoid the dependency on the main library (ultimately, fastai.medical will be a real submodule). In medical.imaging, you have a patch to Path objects to implement a dcmread method for opening them that could be useful.
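
The idea of attaching a reader method to `Path` can be sketched with the standard library alone. This is a minimal illustration, not the code in `medical.imaging`: the method name `looks_like_dicom` is hypothetical, plain attribute assignment stands in for whatever patching helper the library uses, and a real `dcmread` would hand the path to pydicom rather than inspect bytes. The only DICOM fact relied on is that a Part 10 file has a 128-byte preamble followed by the four bytes `DICM`.

```python
from pathlib import Path

def looks_like_dicom(self):
    """Hypothetical helper: True if the file carries the DICOM Part 10 marker."""
    with self.open("rb") as f:
        f.seek(128)                  # skip the 128-byte preamble
        return f.read(4) == b"DICM"  # magic bytes that follow the preamble

# Attach the function to Path so every Path instance gains the method.
Path.looks_like_dicom = looks_like_dicom
```

After this, any `Path` instance exposes `looks_like_dicom()`, and only the module that does the patching needs to know anything about the file format.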

We can host the tgz with our other datasets if the SIIM is okay with that.

@moritzschwyzer
Contributor Author

Hi Sylvain, yes I totally agree, it's much cleaner to keep it separate. I'd really like to use the vision functionality (data augmentation etc.), though. Do you have a recommendation for how to integrate the DICOM loading in the most efficient way while still being able to use the vision functionality?

@sgugger
Contributor

sgugger commented Mar 15, 2020

I see you're using cls=PILImageBW in your data block. Why not create your own subclass of it, named DicomImageBW? You can put the code you used for loading inside the .create method, and it should work as is since you would still have the behavior of PILImageBW everywhere.
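
The shape of that suggestion, sketched with stand-in classes so it runs without fastai or pydicom. Everything here is an assumption made for illustration: `GrayImage` plays the role of `PILImageBW`, `FAKE_DICOM_STORE` and `read_dicom_pixels` replace a real pydicom read, and `flip_lr` represents any inherited vision behavior. Only `create` is overridden, so the subclass keeps the rest.

```python
class GrayImage:
    """Stand-in for a grayscale image type such as PILImageBW (hypothetical)."""

    def __init__(self, pixels):
        self.pixels = pixels  # list of rows, each a list of ints

    @classmethod
    def create(cls, source):
        # Base behavior: the source already is the pixel data.
        return cls(source)

    def flip_lr(self):
        # Represents inherited functionality (e.g. an augmentation).
        return type(self)([row[::-1] for row in self.pixels])


# Hypothetical in-memory "files" standing in for DICOM data on disk.
FAKE_DICOM_STORE = {"chest.dcm": [[0, 100], [200, 300]]}

def read_dicom_pixels(filename):
    """Stand-in for reading a DICOM pixel array (a real version would use pydicom)."""
    return FAKE_DICOM_STORE[filename]


class DicomImageBW(GrayImage):
    """Same behavior as GrayImage; only the loading step differs."""

    @classmethod
    def create(cls, source):
        # The format-specific loading lives here and nowhere else.
        return cls(read_dicom_pixels(source))


img = DicomImageBW.create("chest.dcm")
flipped = img.flip_lr()  # inherited method still works and returns a DicomImageBW
```

Assuming the data block only ever calls `create` on the class it is given, swapping the class passed as `cls` would be the only change needed on the caller's side.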

@moritzschwyzer
Contributor Author

Thanks a lot for your input, that was very helpful! I have now created a subclass PILDicom based on PILBase. Additionally, I modified the dcmread function so that it can be used in the PILDicom.create function, and made the corresponding changes in the tutorial. How do you want to proceed with the small SIIM dataset? Will you download it from http://files.vedavimedical.com/siim_small.tgz and put it on your CDN?

@review-notebook-app

review-notebook-app bot commented Mar 15, 2020

sgugger commented on 2020-03-15T19:38:09Z
----------------------------------------------------------------

Could we hide the output of this cell by storing the predictions in a variable? There is no real point in seeing it.


moritzschwyzer commented on 2020-03-15T19:42:54Z
----------------------------------------------------------------

Sure! I just pushed a new version.

@sgugger
Contributor

sgugger commented Mar 15, 2020

Thanks a lot! Two things: first, I think we can hide one output in your notebook. Second, I'm not sure you should change the dcmread command, as PIL does not deal with int16 images AFAIK. All of this should be in your PILDicom only.

@moritzschwyzer
Contributor Author

I just found the closed issue pytorch/vision#105, which states that PIL handles int16 grayscale images. I have now reverted the dcmread function to the old version and do the PIL conversion in PILDicom.

@sgugger
Contributor

sgugger commented Mar 15, 2020

Thanks for making the changes. This is looking pretty great :)

@sgugger sgugger merged commit 430479f into fastai:master Mar 15, 2020
@sgugger
Contributor

sgugger commented Mar 16, 2020

One follow-up: the image you put as an attachment in the notebook did not arrive on GitHub. Can you put it in the images directory in another PR? Thanks!
I'll deal with the dataset today and adjust the URL when I've put it on our server.

@sgugger
Contributor

sgugger commented Mar 16, 2020

Oh, it is there properly, just not working with our doc building. Sorry, no need to do anything, will fix manually :)

@moritzschwyzer moritzschwyzer deleted the medical_imaging_tutorial branch March 16, 2020 10:50