Hermán Judit (L7D38R)
Kovács Kíra Diána (CXL05B)
We want to detect types of pneumonia from chest X-ray images. We found a suitable dataset on Kaggle (https://www.kaggle.com/datasets/tolgadincer/labeled-chest-xray-images/data), so we did not have to gather data from other sources. The dataset contains images of normal (healthy) lungs and of lungs affected by bacterial or viral pneumonia. The Kaggle competition asked participants to classify the images as healthy or sick, which is a simple binary classification task, but we want to do 3-class classification. This is possible because the file name of each pneumonia image indicates the type of pneumonia (bacterial or viral).
We want to investigate the topic further than the participants of the competition did: we will look at the models they used, but we will also try other methods to see which model performs best on this dataset.
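As a rough illustration of how the 3-class labels could be read from the file names, here is a minimal sketch. It assumes that each file name contains one of the words "normal", "bacteria" or "virus" (case-insensitive); the helper `label_from_filename`, the `CLASSES` tuple and the class order are hypothetical and may differ from what the notebooks do.

```python
from pathlib import Path

# Assumed class names and order; the notebooks may encode the labels differently.
CLASSES = ("normal", "bacteria", "virus")


def label_from_filename(path):
    """Return the index in CLASSES of the class named in the image file name."""
    name = Path(path).name.lower()
    for index, class_name in enumerate(CLASSES):
        # Assumption: the class marker appears as a substring of the file name.
        if class_name in name:
            return index
    raise ValueError(f"No class marker found in file name: {name}")
```

For example, under these assumptions a file called `BACTERIA-0001.jpeg` would map to index 1, and a file name without any of the three markers raises an error instead of being silently labelled.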
melytanulas_bhw.ipynb - This notebook contains the main part of the project. First milestone (10.13): data analysis, data preparation, and data loaders.
Milestone2.ipynb - This notebook contains all the code and required work for the second milestone.
Final.ipynb - This notebook has the final code for the project.
Documentation.pdf - This document contains the documentation of our project.
Since the dataset belongs to a Kaggle data competition, we looked at the other notebooks uploaded there to gather new ideas.
Papers: (we used them in the documentation too)
- Aimina Ali Eli and Abida Ali. Deep Learning Applications in Medical Image Analysis: Advancements, Challenges, and Future Directions. arXiv, 2024.
- Li, Mengfang and Jiang, Yuanyuan and Zhang, Yanzhou and Zhu, Haisheng. Medical image analysis using deep learning algorithms. Frontiers in Public Health, 2023.
- Ker, Justin and Wang, Lipo and Rao, Jai and Lim, Tchoyoson. Deep Learning Applications in Medical Image Analysis. IEEE Access, 2018.
We ran the notebook on Kaggle, so we did not have to download the 1 GB dataset. The easiest way to run the notebook is therefore to upload it to Kaggle and run the code there. Alternatively, you can download the dataset to your own machine and change the file path in the notebook. If you would like, we can share the notebook with you on Kaggle so that you can simply rerun it there; please send us the email address of the person who needs access, and we will grant it.
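If you run the notebook locally, the path change could look like the sketch below. It assumes that Kaggle mounts the dataset under `/kaggle/input/labeled-chest-xray-images` (the slug is taken from the dataset URL) and that the local copy is in a hypothetical `data/labeled-chest-xray-images` folder; `resolve_data_dir` is an illustrative helper, not a function from our notebooks.

```python
from pathlib import Path

# Assumed locations: the Kaggle mount point is inferred from the dataset URL,
# and the local folder name is only a placeholder.
KAGGLE_DIR = Path("/kaggle/input/labeled-chest-xray-images")
LOCAL_DIR = Path("data/labeled-chest-xray-images")


def resolve_data_dir(kaggle_dir=KAGGLE_DIR, local_dir=LOCAL_DIR):
    """Use the Kaggle mount when it exists, otherwise fall back to the local copy."""
    return kaggle_dir if kaggle_dir.is_dir() else local_dir


DATA_DIR = resolve_data_dir()
```

With a helper like this, the same notebook can run in both environments and only `LOCAL_DIR` needs to be adjusted to wherever the dataset was downloaded.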
We trained a simple convolutional network as a baseline model and evaluated its performance in the same notebook on Kaggle. The uploaded Milestone2.ipynb file contains the training and evaluation code (you can upload it to Kaggle and run it there).