
This repository identifies melanoma, a type of skin cancer, from lesion images using an ensemble of neural networks. It also explores several preprocessing techniques, along with data visualization and exploratory analysis of the dataset.


PrasunDatta/SIIM-ISIC-Melanoma-Classification_2020


Corresponding Kaggle Notebook Link

https://www.kaggle.com/code/prasundatta/melanoma-classification-data-visualization-eda

Description

Skin cancer is the most prevalent type of cancer. Melanoma, specifically, is responsible for 75% of skin cancer deaths, despite being the least common skin cancer. The American Cancer Society estimates over 100,000 new melanoma cases will be diagnosed in 2020. It's also expected that almost 7,000 people will die from the disease. As with other cancers, early and accurate detection—potentially aided by data science—can make treatment more effective.

Currently, dermatologists evaluate every one of a patient's moles to identify outlier lesions or “ugly ducklings” that are most likely to be melanoma. Existing AI approaches have not adequately considered this clinical frame of reference. Dermatologists could enhance their diagnostic accuracy if detection algorithms take into account “contextual” images within the same patient to determine which images represent a melanoma. If successful, classifiers would be more accurate and could better support dermatological clinic work.

In this competition, you’ll identify melanoma in images of skin lesions. In particular, you’ll use images within the same patient and determine which are likely to represent a melanoma. Using patient-level contextual information may help the development of image analysis tools, which could better support clinical dermatologists.

Melanoma is a deadly disease, but if caught early, most melanomas can be cured with minor surgery. Image analysis tools that automate the diagnosis of melanoma will improve dermatologists' diagnostic accuracy. Better detection of melanoma has the opportunity to positively impact millions of people.

Dataset Description

The images are provided in DICOM format, a commonly used medical imaging format. Each DICOM file contains both the image and its metadata, and can be read with widely available libraries such as pydicom.

Images are also provided in JPEG and TFRecord format (in the jpeg and tfrecords directories, respectively). Images in TFRecord format have been resized to a uniform 1024x1024.

Metadata is also provided outside of the DICOM format, in CSV files. See the Columns section for a description.

What am I predicting?

You are predicting a binary target for each image. Your model should predict the probability (floating point) between 0.0 and 1.0 that the lesion in the image is malignant (the target). In the training data, train.csv, the value 0 denotes benign, and 1 indicates malignant.
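Since the repository combines several neural networks, one simple way to produce a single probability per image is to average the per-model probabilities. The sketch below is an illustration under stated assumptions, not the notebook's actual ensembling code: the function name `ensemble_probabilities` and the scores are hypothetical, and plain averaging is only one of several possible ensembling schemes.

```python
from statistics import fmean


def ensemble_probabilities(per_model_probs):
    """Average per-image malignancy probabilities across several models.

    per_model_probs: list of lists, one inner list per model, each holding
    one probability per image, in the same image order for every model.
    """
    if not per_model_probs:
        raise ValueError("need at least one model's predictions")
    n_images = len(per_model_probs[0])
    for probs in per_model_probs:
        if len(probs) != n_images:
            raise ValueError("all models must score the same images")
        if any(not 0.0 <= p <= 1.0 for p in probs):
            raise ValueError("probabilities must lie in [0.0, 1.0]")
    # zip(*...) regroups the scores image by image; fmean averages them.
    return [fmean(scores) for scores in zip(*per_model_probs)]


# Made-up scores from three hypothetical models for the same three images.
model_a = [0.10, 0.80, 0.40]
model_b = [0.20, 0.90, 0.30]
model_c = [0.00, 0.70, 0.50]
averaged = ensemble_probabilities([model_a, model_b, model_c])
```

Because every input lies in [0.0, 1.0], the average does too, so the result can be used directly as the predicted probability of malignancy.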

Files

  • train.csv => the training set
  • test.csv => the test set

Columns

  • image_name - unique identifier, points to filename of related DICOM image
  • patient_id - unique patient identifier
  • sex - the sex of the patient (when unknown, will be blank)
  • age_approx - approximate patient age at time of imaging
  • anatom_site_general_challenge - location of imaged site
  • diagnosis - detailed diagnosis information (train only)
  • benign_malignant - indicator of malignancy of imaged lesion
  • target - binarized version of the target variable
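The columns above can be explored with nothing more than the Python standard library. The following sketch groups images by `patient_id`, measures the share of malignant labels, and counts blank `sex` values. The function name `summarize_metadata` is hypothetical, and the rows in `SAMPLE_TRAIN_CSV` are invented for illustration; they only mimic the documented column layout and are not real competition data.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the documented column layout; not real competition data.
SAMPLE_TRAIN_CSV = """image_name,patient_id,sex,age_approx,anatom_site_general_challenge,diagnosis,benign_malignant,target
IMG_0001,P_001,male,45.0,torso,nevus,benign,0
IMG_0002,P_001,male,45.0,upper extremity,melanoma,malignant,1
IMG_0003,P_002,,60.0,torso,unknown,benign,0
IMG_0004,P_002,,60.0,head/neck,unknown,benign,0
"""


def summarize_metadata(csv_file):
    """Group images by patient and collect basic label statistics."""
    images_by_patient = defaultdict(list)
    malignant = 0
    missing_sex = 0
    total = 0
    for row in csv.DictReader(csv_file):
        total += 1
        images_by_patient[row["patient_id"]].append(row["image_name"])
        malignant += int(row["target"])  # target is 0 (benign) or 1 (malignant)
        if row["sex"] == "":  # sex is blank when unknown
            missing_sex += 1
    return {
        "n_images": total,
        "n_patients": len(images_by_patient),
        "malignant_rate": malignant / total if total else 0.0,
        "missing_sex": missing_sex,
        "images_by_patient": dict(images_by_patient),
    }


summary = summarize_metadata(io.StringIO(SAMPLE_TRAIN_CSV))
```

To run the same summary on the real file, pass an open file handle instead, for example `open("train.csv", newline="")`. Grouping by patient is a natural starting point for the patient-level context described earlier, since it collects all lesions of one patient in one place.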
