Interpretable image-based machine learning models in healthcare
Neural networks can be great at solving problems, but they sometimes give wrong answers. Would you trust an algorithm that got life-saving information right 80% of the time?
Summer of Research
Project by Harper Shen, University of Auckland, supervised by Quentin Thurier (Orion Health) and Dr Yun Sing Koh (University of Auckland).

An algorithm is easier to trust when you can see how it made its prediction.
Neural networks have been used successfully to analyse medical images, particularly radiology images such as X-rays, sometimes with higher accuracy than humans. However, these algorithms can be easily fooled by random patterns, and sometimes make mistakes that would be obvious to a human [1].
Harper Shen developed a neural network that identifies skin cancer from photos and creates easily interpretable images to show how it came to its conclusion. A doctor or even an untrained person can look at these images and see if the algorithm analysed the wrong part of the image.
The algorithm is a convolutional neural network, a type of deep neural network with multiple layers designed for image analysis. It is trained on the ISIC 2018 dataset of skin images, and it identifies different types of cancerous and non-cancerous skin lesions.
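The article does not describe the exact network architecture, but classifiers of this kind are commonly built by fine-tuning a pretrained convolutional network on the lesion photos. The sketch below illustrates that general idea in PyTorch; the ResNet-18 backbone, input size, and seven-class output (the ISIC 2018 classification task defines seven lesion categories) are assumptions for illustration, not details of Harper's model.

```python
# Illustrative only: a minimal lesion classifier built by fine-tuning a
# pretrained network. The backbone, input size, and class count are
# assumptions, not details of the project's actual model.
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_CLASSES = 7  # the ISIC 2018 classification task defines seven lesion categories

# Start from an ImageNet-pretrained backbone and replace the final layer.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Standard preprocessing for ImageNet-pretrained networks.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def predict(model, image):
    """Return class probabilities for a single PIL image."""
    model.eval()
    x = preprocess(image).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        logits = model(x)
    return torch.softmax(logits, dim=1).squeeze(0)
```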
Training a neural network for medical images is difficult. Modern neural networks are typically designed to identify visually distinct objects such as bicycles, tables, and people. Different types of skin lesion, by contrast, share similar features such as dots and streaks, and some images contain two lesion types.
Harper’s neural network uses two types of image interpretation: class activation maps and saliency maps. Each highlights the parts of the image that most influenced the network’s prediction, but they work this out in different ways. Class activation maps are more widely used to interpret medical images, and Harper’s algorithm also uses them to improve its accuracy.
The neural network produces two heat maps, one for each of the algorithm’s interpretation methods. These look like the original image, but with hot spots to show which parts of the image the algorithm analysed to come to its conclusion.
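The article does not include the project's code, but the general techniques are well established: a saliency map comes from the gradient of the class score with respect to the input pixels, while a class-activation-style map is built from the activations of the network's last convolutional layer. The sketch below is illustrative only; it uses Grad-CAM (a common class-activation-map variant, not necessarily the one used in this project) and assumes the PyTorch model from the earlier sketch.

```python
# Illustrative only: two ways to build a heat map for a prediction.
# grad_cam uses the Grad-CAM technique; the article does not say which
# class-activation-map variant the project used.
import torch
import torch.nn.functional as F

def saliency_map(model, x, target_class):
    """Gradient-based saliency: how strongly each input pixel affects the class score."""
    model.eval()
    x = x.clone().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    # Maximum absolute gradient across the colour channels.
    return x.grad.abs().max(dim=1)[0].squeeze(0)

def grad_cam(model, x, target_class, conv_layer):
    """Grad-CAM heat map built from the activations of a convolutional layer."""
    activations, gradients = [], []
    fwd = conv_layer.register_forward_hook(lambda m, i, o: activations.append(o))
    bwd = conv_layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))
    model.eval()
    score = model(x)[0, target_class]
    model.zero_grad()
    score.backward()
    fwd.remove()
    bwd.remove()
    acts, grads = activations[0], gradients[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)        # importance of each channel
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
    return cam.squeeze()
```

For the ResNet-18 sketch above, conv_layer would be model.layer4; both maps are the same size as the input, so they can be overlaid on the original photo as colour heat maps.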
If the hot spot is directly over the skin lesion, the user can be reasonably confident of the result. If the hot spot is over another part of the image, such as a birthmark, scar, body hair, or shadow, the user will know the analysis is inaccurate.
The algorithm may offer two different predictions of a lesion’s type, with hot spots to guide the user as to which result is more likely correct. This is more useful than a standalone neural network with no heat maps, where the user has nothing to go on but the predicted probabilities.
Harper added a second stage to the algorithm that uses information from the heat maps to improve accuracy. The two-stage algorithm was 83.6% accurate in testing, and the heat maps give users an easy way to see how it came to its predictions.
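The article does not explain how the second stage uses the heat-map information. One plausible reading, sketched below purely for illustration, is to locate the hot spot, crop the photo around it, and classify the crop again; the two_stage_predict function, threshold, and margin are all hypothetical and reuse the grad_cam sketch above.

```python
# Purely illustrative guess at a two-stage pass: classify, find the hot
# spot with the grad_cam sketch above, crop around it, and classify the
# crop again. The project's actual second stage is not described in the
# article; the threshold and margin below are hypothetical.
import torch
import torch.nn.functional as F

def two_stage_predict(model, x, conv_layer, margin=0.2):
    probs = torch.softmax(model(x), dim=1).squeeze(0)
    top_class = int(probs.argmax())

    cam = grad_cam(model, x, top_class, conv_layer)
    ys, xs = torch.where(cam > 0.5 * cam.max())          # pixels inside the hot spot
    if len(ys) == 0:
        return probs                                      # no clear hot spot: keep stage one
    # Bounding box around the hot spot, padded by a margin.
    h, w = cam.shape
    pad_y, pad_x = int(h * margin), int(w * margin)
    y0, y1 = max(int(ys.min()) - pad_y, 0), min(int(ys.max()) + pad_y, h)
    x0, x1 = max(int(xs.min()) - pad_x, 0), min(int(xs.max()) + pad_x, w)
    # Re-classify a zoomed-in view of the suspected lesion.
    crop = F.interpolate(x[:, :, y0:y1, x0:x1], size=x.shape[2:],
                         mode="bilinear", align_corners=False)
    return torch.softmax(model(crop), dim=1).squeeze(0)
```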
Harper Shen is one of 10 students who took part in the Summer of Research programme funded by Precision Driven Health. The research is at an early “proof of concept” stage. The projects offer fresh insights into what healthcare will look like when precision medicine is widely used.
[1] Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 427–436, 2015.