GitHub - wangleihitcs/ThoraxDiseaseClassification: A multi-label-classification model for common thorax disease.

Intro

A multi-label-classification model for chest diseases.

Config

Python 2.7.15
TensorFlow 1.8.0
Python packages
- nltk
- PIL
- json
- numpy

These are all common toolkits, so I don't link to them here.

Data Download and Preparation

NIH Chest X-ray Dataset (Kaggle download link)
- copy 'Data_Entry_2017.csv' to the 'data/' directory
- unzip 'images_001.zip' through 'images_012.zip' into 'data/images'
- copy 'train_val_list.txt' and 'test_list.txt' to 'data/'

Pretrained VGG19 model
- download vgg_19_2016_08_28.tar.gz
- extract it, then copy 'vgg_19.ckpt' to 'data/pretrain_vgg/'
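
Before preprocessing, it can help to confirm that the files listed above are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and the paths are copied from the steps above, relative to the repository root.

```python
import os

# Paths taken from the setup steps above, relative to the repository root.
REQUIRED_PATHS = [
    "data/Data_Entry_2017.csv",
    "data/train_val_list.txt",
    "data/test_list.txt",
    "data/images",
    "data/pretrain_vgg/vgg_19.ckpt",
]


def missing_paths(root="."):
    """Return the required paths that do not exist under `root`."""
    return [p for p in REQUIRED_PATHS
            if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All expected files and directories are present.")
```

Running it from the repository root prints any path that still needs to be copied or extracted.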

Train

First, preprocess data

Generate 'data_entry.json' and 'data_label.json':

$ cd preprocess
$ python get_data_entry.py
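
For readers unfamiliar with the dataset, the sketch below illustrates how multi-label targets can be derived from 'Data_Entry_2017.csv', where the 'Finding Labels' column holds labels separated by '|'. This is an illustration only: the class order is taken from the AUC table in this README, and the actual layout of 'data_label.json' written by get_data_entry.py is not shown here, so the function names and output structure are assumptions.

```python
import csv
import io

# Class order follows the AUC table in this README; the order used by
# get_data_entry.py is an assumption.
CLASSES = [
    "Effusion", "Pneumothorax", "Edema", "Cardiomegaly", "Pleural_Thickening",
    "Atelectasis", "Consolidation", "Emphysema", "Pneumonia", "Nodule",
    "Mass", "Infiltration", "Hernia", "No Finding", "Fibrosis",
]


def to_multi_hot(finding_labels):
    """Convert an 'A|B|C' label string into a 0/1 vector over CLASSES."""
    present = set(finding_labels.split("|"))
    return [1 if name in present else 0 for name in CLASSES]


def read_labels(csv_file):
    """Map each image file name to its multi-hot label vector."""
    reader = csv.DictReader(csv_file)
    return {row["Image Index"]: to_multi_hot(row["Finding Labels"])
            for row in reader}


# Tiny in-memory stand-in for 'data/Data_Entry_2017.csv' (made-up rows).
sample = io.StringIO(
    "Image Index,Finding Labels\n"
    "a.png,Cardiomegaly|Effusion\n"
    "b.png,No Finding\n"
)
labels = read_labels(sample)
```

To run it on the real file, pass an opened 'data/Data_Entry_2017.csv' to `read_labels` instead of the in-memory sample.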

Generate 'data/tfrecord/train-xx.tfrecord', 'data/tfrecord/test-xx.tfrecord', 'train_tfrecord_name.txt' and 'test_tfrecord_name.txt':
```
$ python datasets.py    
```

Second, train the model

You can review mlc_model.py to verify the model definition before training:
```
$ python main.py    
```

Test Demo

Use demo.py to test the trained model.

Provide a chest X-ray image to test:

$ python demo.py --img='data/examples/CXR3_IM-1384-1001.png'

[Image: test demo example]

Experiments

Loss

In the end, I trained for 100 epochs and the training mlc_loss_weighted dropped to 0.0455; this took 36 hours. You can see the details in 'data/log.txt'.
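
The name mlc_loss_weighted suggests a class-balanced multi-label loss. As a rough illustration, the NumPy sketch below implements a positive/negative balanced sigmoid cross-entropy in the spirit of the weighted loss described in the ChestX-ray8 paper. Whether mlc_model.py computes its loss exactly this way is an assumption; the function name and the `eps` parameter are hypothetical.

```python
import numpy as np


def weighted_sigmoid_cross_entropy(logits, labels, eps=1e-7):
    """Mean positive/negative balanced cross-entropy over a batch.

    logits, labels: arrays of shape (batch, num_classes); labels are 0/1.
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    probs = 1.0 / (1.0 + np.exp(-logits))

    num_pos = labels.sum()
    num_neg = labels.size - num_pos
    # Balancing factors: the rarer side of the batch gets the larger weight.
    beta_pos = labels.size / max(num_pos, 1.0)
    beta_neg = labels.size / max(num_neg, 1.0)

    loss = -(beta_pos * labels * np.log(probs + eps)
             + beta_neg * (1.0 - labels) * np.log(1.0 - probs + eps))
    return loss.mean()
```

Because most entries in a chest X-ray label matrix are zeros, an unweighted loss can be driven down by predicting "absent" everywhere; the balancing factors counteract that.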

AUC

I evaluated the AUC at epoch = 20, iter = 28000. In practice the model starts overfitting once epoch > 15, so you don't need to train for many epochs.

| Disease | Ours | Paper | Test num |
| --- | --- | --- | --- |
| Effusion | 0.7584 | 0.700 | 4658 |
| Pneumothorax | 0.7498 | 0.799 | 2665 |
| Edema | 0.7635 | 0.805 | 925 |
| Cardiomegaly | 0.7735 | 0.810 | 1069 |
| Pleural_Thickening | 0.7602 | 0.684 | 1143 |
| Atelectasis | 0.7532 | 0.700 | 3279 |
| Consolidation | 0.7399 | 0.703 | 1815 |
| Emphysema | 0.7385 | 0.833 | 1093 |
| Pneumonia | 0.7367 | 0.658 | 555 |
| Nodule | 0.7272 | 0.668 | 1623 |
| Mass | 0.7217 | 0.693 | 1748 |
| Infiltration | 0.7399 | 0.661 | 6112 |
| Hernia | 0.7520 | 0.871 | 86 |
| No Finding | 0.7782 | - | 9861 |
| Fibrosis | 0.7813 | 0.786 | 435 |
| Mean | 0.7516 | 0.745 | - |
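
For reference, a per-class AUC like the ones reported above can be computed from predicted scores and ground-truth labels as sketched below. This is a generic NumPy implementation based on the rank-sum (Mann-Whitney U) statistic, not the evaluation code used in this repository; the function names are hypothetical.

```python
import numpy as np


def binary_auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) statistic, with tie handling."""
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels).astype(bool)
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both classes

    # Average ranks (1-based) so that tied scores share the same rank.
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=np.float64)
    start = 0
    while start < scores.size:
        end = start
        while (end + 1 < scores.size
               and sorted_scores[end + 1] == sorted_scores[start]):
            end += 1
        ranks[order[start:end + 1]] = 0.5 * (start + end) + 1.0
        start = end + 1

    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def per_class_auc(score_matrix, label_matrix):
    """One AUC per column of (num_images, num_classes) arrays."""
    score_matrix = np.asarray(score_matrix)
    label_matrix = np.asarray(label_matrix)
    return [binary_auc(score_matrix[:, c], label_matrix[:, c])
            for c in range(label_matrix.shape[1])]
```

Averaging the list returned by `per_class_auc` gives a mean AUC comparable in kind to the "Mean" row.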

References

Wang, Xiaosong, et al. "ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017.
Wang, Xiaosong, et al. "TieNet: Text-image embedding network for common thorax disease classification and reporting in chest X-rays." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
