The SIIM-ISIC Melanoma Classification dataset can be downloaded here. The training set consists of 32542 benign images and 584 malignant melanoma images. Please note that this dataset is extremely unbalanced: fewer than 2% of the training images are malignant. The picture below shows one example from each class.
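To put a number on that imbalance, here is a minimal sketch that uses only the two counts quoted above. The variable names are my own, and using the benign-to-malignant ratio as a positive-class weight in the loss is just one common option, not necessarily what is done later in this post.

```python
# Class counts quoted above for the SIIM-ISIC training set.
n_benign = 32542
n_malignant = 584

total = n_benign + n_malignant
malignant_fraction = n_malignant / total

# Assumption: one common way to counter imbalance is to weight the
# positive (malignant) class by the benign-to-malignant ratio.
pos_weight = n_benign / n_malignant

print(f"total training images: {total}")
print(f"malignant share: {malignant_fraction:.2%}")
print(f"benign-to-malignant ratio: {pos_weight:.1f}")
```

With so few positives, plain accuracy is a misleading metric: a model that always predicts "benign" would already be right on more than 98% of the training images.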
I recently participated in the SIIM-ISIC Melanoma Classification competition on Kaggle. In this competition, participants are asked to identify melanoma in images of skin lesions. Interestingly, they also provide metadata about the patient and the anatomic site in addition to the image. In essence, we have both image and structured or tabular data for each example. For the image, we can use a CNN-based model, and for the tabular data, we can use embeddings and fully connected layers as explored in my previous posts on UFC and League of Legends predictions. It is easy to build a separate model for each data modality. But what if we want to build a joint model that trains on both data modalities simultaneously? There are inspiring discussions in the competition forum including this thread. In this post, I will demonstrate how to integrate the two data modalities and train a joint deep learning model using fastai and the image_tabular library, which I created specifically for these tasks.
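To make the idea of a joint model concrete before getting into the libraries, here is a minimal sketch in plain PyTorch. It is not the image_tabular API. The class name `JointImageTabularModel`, the layer sizes, and the random inputs are all hypothetical, and the tabular branch assumes already-numeric features rather than the embeddings mentioned above. The only point is the pattern: extract features from each modality, concatenate them, and classify from the combined vector.

```python
import torch
import torch.nn as nn


class JointImageTabularModel(nn.Module):
    """Hypothetical joint model: concatenates image and tabular features."""

    def __init__(self, n_tabular_features: int, n_classes: int = 2):
        super().__init__()
        # Tiny CNN standing in for a real (e.g. pretrained) image backbone.
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 8, 1, 1)
            nn.Flatten(),             # -> (batch, 8)
        )
        # Fully connected layer for numeric tabular features.
        self.tabular_branch = nn.Sequential(
            nn.Linear(n_tabular_features, 16),
            nn.ReLU(),
        )
        # Classification head on the concatenated feature vector.
        self.head = nn.Linear(8 + 16, n_classes)

    def forward(self, image: torch.Tensor, tabular: torch.Tensor) -> torch.Tensor:
        img_feat = self.image_branch(image)
        tab_feat = self.tabular_branch(tabular)
        joint = torch.cat([img_feat, tab_feat], dim=1)  # (batch, 24)
        return self.head(joint)


model = JointImageTabularModel(n_tabular_features=4)
images = torch.randn(5, 3, 32, 32)  # 5 random RGB "images"
tabular = torch.randn(5, 4)         # 5 rows of 4 made-up numeric features
logits = model(images, tabular)     # one score per class for each example
```

Because both branches feed a single head, one backward pass updates the image and tabular weights together, which is what training on both modalities simultaneously means here.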