Cancer diagnostics is central to cancer recovery and survival, but many expensive procedures are needed to determine the correct treatment. Machine learning (ML) approaches can help with diagnostic prediction from circulating tumor cells (CTCs) in a liquid biopsy, or from a primary tumor in a solid biopsy. Once a deep learning model has predicted the metastatic potential, doctors in a clinical setting can administer a safe and correct treatment for a specific patient. This paper investigates the use of deep convolutional neural networks for predicting a specific cancer cell line as a tool for label-free identification. Specifically, deep learning strategies for weight initialization and performance metrics are described, with transfer learning and the accuracy metric used in this work. The equipment used for prediction involves only brightfield microscopy, without chemical labels, advanced instruments, or time-consuming biological techniques, giving an advantage over current diagnostic methods. In the procedure, three binary datasets of well-known cancer cell lines were collected, each contrasting two classes that differ in metastatic potential. Two classification models, EfficientNetV2 and ResNet-50, were adopted, and an analysis is given for each stage of the ML architecture. The training results for each model and dataset are provided and systematically compared. Both models performed well on the test sets, with EfficientNetV2 reaching up to 99% accuracy. On these test sets, EfficientNetV2 outperformed ResNet-50 by an average percent increase of 3.5% for each dataset. The high accuracy obtained from the predictions demonstrates that the system can be retrained on a large-scale clinical dataset.
Instructions
- First, create a folder in your Google Drive account called "cell_classification" (this step is important for keeping the directory paths consistent)
- Use this link to access the shared Google Drive folder
- At the top, a dropdown arrow follows the folder location (Shared with me > data_files). Click this dropdown arrow.
- Click the "Add shortcut to Drive" button, navigate into your cell_classification folder, and click the blue "Add Shortcut" button. This adds a shortcut to the shared Google Drive folder inside your cell_classification folder.
- Open the ENetV2_classifier.ipynb Colab notebook from the Colab badge provided above, then choose File > Save a copy in Drive.
- This saves the notebook in the "Colab Notebooks" folder in your Google Drive. Move this notebook to the cell_classification folder and rename it ENetV2_classifier.ipynb so that the directories are correct.
- Do the same with the ResNet_classifier.ipynb colab notebook. The final cell_classification folder should look like this:
- You can now use the notebooks to perform more testing or contribute to the project. You can find the code written for many of the figures in the final paper: DOI Website
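Before running the notebooks, it can help to confirm that the folder ended up with the expected contents. The following is a minimal sketch, not part of the project's notebooks. It assumes Drive is mounted at Colab's usual `/content/drive` location, that the folder is named `cell_classification` as in the first step, and that the shortcut keeps the shared folder's name `data_files`. The helper `missing_entries` is a hypothetical name introduced for illustration.

```python
from pathlib import Path

# Assumption: Colab's usual mount point plus the folder name from the first step.
PROJECT_DIR = Path("/content/drive/MyDrive/cell_classification")

# Assumption: the shortcut keeps the shared folder's name, "data_files".
EXPECTED_ENTRIES = [
    "data_files",
    "ENetV2_classifier.ipynb",
    "ResNet_classifier.ipynb",
]


def missing_entries(project_dir, expected=EXPECTED_ENTRIES):
    """Return the expected names that are not present in project_dir."""
    project_dir = Path(project_dir)
    return [name for name in expected if not (project_dir / name).exists()]


if __name__ == "__main__":
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
        drive.mount("/content/drive")
    except ImportError:
        print("Not running in Colab; skipping the Drive mount.")

    missing = missing_entries(PROJECT_DIR)
    if missing:
        print(f"Missing from {PROJECT_DIR}: {missing}")
    else:
        print("Folder looks complete.")
```

If anything is reported missing, revisit the corresponding step above before opening the notebooks.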
Testing
Nearly all figures and tables from the paper are outlined in the ENetV2 and ResNet50 Colab notebooks. First choose the dataset that you would like to investigate, e.g. the SKOV3nvsd dataset. To select it, set the "dataset" variable to 3, because this dataset is the fourth element in the datasets list and the list is indexed from zero:
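A minimal sketch of that selection is shown here. Only the name SKOV3nvsd and its position (index 3) come from the text above. The other list entries are hypothetical placeholders, and the notebooks define the real list.

```python
# Hypothetical placeholder names: only "SKOV3nvsd" and its index are taken
# from the instructions; see the notebooks for the actual list.
datasets = ["dataset_a", "dataset_b", "dataset_c", "SKOV3nvsd"]

# Python lists are zero-indexed, so 3 selects the fourth element.
dataset = 3
selected = datasets[dataset]
print(selected)  # SKOV3nvsd
```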
Table 1 displays the annotation summary for each dataset after augmentations. This summary can be reproduced in section 2.2 of each Colab notebook:
After running this cell you will get the following output:
This matches the numbers for the SKOV3drvsn data in Table 2 in the publication:
Contributions
Publication Authors:
Karl Gardner, Rutwik Joshi, Nayeem Kashem, Thanh Pham, Qiugang Lu, and Wei Li
Publication Acknowledgements:
WL acknowledges support from the National Science Foundation (CBET, Grant No. 1935792) and the National Institutes of Health (IMAT, Grant No. 1R21CA240185-01).