Saliency Generation and Analysis (classification)¶
A saliency map highlights the areas of an image that contributed to the produced score. For example, in the context of tuberculosis detection from chest X-ray images, these would be the locations in the images where tuberculosis is (supposedly) present.
This package provides scripts that can generate saliency maps and compute relevant metrics for evaluating the performance of saliency-mapping algorithms, taking into consideration result completeness and human interpretability. Result completeness evaluates how much of the output score is explained by the computed saliency maps. Human interpretability evaluates how much of the generated saliency map matches human expectations when performing the same task.
Evaluating human interpretability naturally requires a datamodule with human-annotated saliency information that is expected to correlate with the image labels.
The command-line interface for saliency map generation and analysis is available through the mednet classify saliency subcommands. The commands should be called in sequence to generate intermediate outputs required for subsequent commands:
Fig. 2 Overview of CLI commands for saliency generation and evaluation. Clicking on each item leads to the appropriate specific documentation. Saliency generation can be done for any datamodule split. In this figure, only the Test data set is displayed for illustrative purposes.¶
Saliency Generation¶
Saliency maps can be generated with the saliency generate command. They are represented as numpy arrays of the same size as the images, with values in the range [0, 1], and saved as .npy files.

Several saliency-mapping algorithms are available to choose from, which can be specified with the -s option. The default algorithm is GradCAM.
To generate saliency maps for all splits in a datamodule, run a command such as:
mednet classify saliency generate -vv pasa tbx11k-v1-healthy-vs-atb --weight=path/to/model-at-lowest-validation-loss.ckpt --output-folder=results
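The generated .npy files can be inspected directly with NumPy, for example to verify their shape and value range. The sketch below assumes a hypothetical file path inside the output folder; the actual layout depends on the datamodule, split and sample names.

```python
import numpy as np

# Hypothetical path: the actual layout under --output-folder depends on
# the datamodule, split and sample names used during generation.
saliency = np.load("results/test/sample-0001.npy")

print(saliency.shape)                   # same height/width as the input image
print(saliency.min(), saliency.max())   # values are expected to lie in [0, 1]
```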
Viewing¶
To overlay saliency maps over the original images, use the saliency view command. Results are saved as PNG images in which brighter pixels correspond to areas with higher saliency.
To generate visualizations, run a command such as:
# input-folder is the location of the saliency maps created as per above
mednet classify saliency view -vv pasa tbx11k-v1-healthy-vs-atb --input-folder=input-folder --output-folder=results
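The view command produces these overlays for you; the sketch below only illustrates the underlying idea of blending a saliency map with its source image. The file names are hypothetical placeholders for an input image and its matching saliency map.

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Hypothetical file names; substitute the image and its matching .npy map.
image = np.asarray(Image.open("chest-xray.png").convert("L"))
saliency = np.load("results/test/sample-0001.npy")  # values in [0, 1]

plt.imshow(image, cmap="gray")
plt.imshow(saliency, cmap="jet", alpha=0.4)  # warmer colours = higher saliency
plt.axis("off")
plt.savefig("overlay.png", bbox_inches="tight")
```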
Completeness¶
The saliency completeness script computes ROAD scores of saliency maps and saves them in a JSON file. The ROAD algorithm [RLB+22] estimates the explainability (in the completeness sense) of saliency maps by substituting relevant pixels in the input image with local averages, re-running prediction on the altered image, and measuring changes in the output classification score once said perturbations are in place. By substituting the most or least relevant pixels with surrounding averages, the ROAD algorithm estimates the importance of those elements in the produced saliency map.
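The following is a minimal sketch of the perturbation idea behind this analysis, not the actual ROAD implementation (which uses a more sophisticated imputation scheme): the most salient pixels are replaced by a locally averaged version of the image, the model is re-run, and the drop in the output score is recorded. The predict callable is a hypothetical placeholder for a model returning a scalar score.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def road_style_score_drop(image, saliency, predict, percentile=80):
    """Simplified completeness check in the spirit of ROAD.

    Replaces the most salient pixels with a local average, re-runs the
    model, and returns the drop in the output score.  ``predict`` is a
    hypothetical callable mapping an image to a scalar score.
    """
    # Local average used as a stand-in for the removed pixels.
    local_avg = uniform_filter(image.astype(float), size=9)

    # Mask of the most relevant pixels according to the saliency map.
    threshold = np.percentile(saliency, percentile)
    mask = saliency >= threshold

    # Perturbed image: most relevant pixels replaced by their local average.
    perturbed = np.where(mask, local_avg, image)

    return predict(image) - predict(perturbed)
```

A large score drop when the most relevant pixels are removed (or a small drop when the least relevant ones are removed) indicates that the saliency map indeed explains the model's output.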
To run completeness analysis for a given model and saliency-map algorithm on all splits of a datamodule, use the saliency completeness command. ROAD scores for each input sample are computed and stored in a JSON file for later analysis. For example:
mednet classify saliency completeness -vv pasa tbx11k-v1-healthy-vs-atb --device="cuda:0" --weight=path/to/model-at-lowest-validation-loss.ckpt --output-folder=results
Note
Running the completeness analysis on a GPU is strongly advised, as the algorithm requires multiple model passes per sample.
The target datamodule does NOT require specific annotations for this analysis.
Interpretability¶
Given a target label, the interpretability step computes the proportional energy and average saliency focus over a datamodule. The proportional energy is the fraction of the total saliency activation that lies within the ground-truth bounding boxes. The average saliency focus is the sum of the saliency-map values over the ground-truth bounding boxes, normalized by the total area covered by those boxes.
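The sketch below illustrates these two definitions on a single saliency map. The box representation (a list of corner coordinates) is a hypothetical choice for the example, not necessarily the format used internally by mednet.

```python
import numpy as np

def interpretability_metrics(saliency, boxes):
    """Illustrative computation of the two metrics described above.

    ``saliency`` is a 2D array with values in [0, 1]; ``boxes`` is a list
    of (xmin, ymin, xmax, ymax) ground-truth bounding boxes (a hypothetical
    representation for this sketch).
    """
    # Binary mask covering the union of all ground-truth boxes.
    mask = np.zeros_like(saliency, dtype=bool)
    for xmin, ymin, xmax, ymax in boxes:
        mask[ymin:ymax, xmin:xmax] = True

    inside = saliency[mask].sum()

    # Proportional energy: activation inside the boxes over total activation.
    proportional_energy = inside / saliency.sum()

    # Average saliency focus: activation inside the boxes over the area they cover.
    average_saliency_focus = inside / mask.sum()

    return proportional_energy, average_saliency_focus
```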
To run interpretability analysis for a given model and saliency-map algorithm on all splits of a datamodule, use the saliency interpretability command. The proportional energy and average saliency focus values for each input sample are computed and stored in a JSON file for later analysis. For example:
mednet classify saliency interpretability -vv tbx11k-v1-healthy-vs-atb --input-folder=parent-folder/saliencies/ --output-json=path/to/interpretability-scores.json
Note
Currently, this functionality requires a datamodule containing human-annotated bounding boxes.