mednet.engine.classify.evaluator

Defines functionality for the evaluation of predictions.

Functions

eer_threshold(predictions)

Calculate the (approximate) threshold leading to the equal error rate.

make_plots(results)

Create plots for all curves and score distributions in results.

make_table(data, fmt)

Tabulate summaries from multiple splits.

maxf1_threshold(predictions)

Calculate the threshold leading to the maximum F1-score on a precision-recall curve.

run_binary(name, predictions, binning[, ...])

Run inference and calculate measures for binary classification.

mednet.engine.classify.evaluator.eer_threshold(predictions)[source]

Calculate the (approximate) threshold leading to the equal error rate.

Parameters:

predictions (Iterable[tuple[str, int, float]]) – An iterable of models.classify.typing.BinaryPrediction objects.

Returns:

The EER threshold value.

Return type:

float
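
The following is a minimal sketch of the idea behind an equal-error-rate threshold, not mednet's actual implementation. It assumes each prediction is a (name, label, score) tuple with label 1 for positives, and that a sample is classified as positive when its score is at or above the threshold. The function name approximate_eer_threshold is hypothetical.

```python
from typing import Iterable

# (name, label, score): mirrors the tuple[str, int, float] shape in the signature.
Prediction = tuple[str, int, float]


def approximate_eer_threshold(predictions: Iterable[Prediction]) -> float:
    """Return the candidate threshold where FPR and FNR are closest."""
    preds = list(predictions)
    positives = [s for _, y, s in preds if y == 1]
    negatives = [s for _, y, s in preds if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    best_threshold, best_gap = 0.0, float("inf")
    # Only observed scores are tried as thresholds, hence "approximate".
    for t in sorted({s for _, _, s in preds}):
        # Assumed convention: score >= t means "predicted positive".
        fpr = sum(s >= t for s in negatives) / len(negatives)
        fnr = sum(s < t for s in positives) / len(positives)
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_threshold, best_gap = t, gap
    return best_threshold


example = [("a", 0, 0.1), ("b", 0, 0.4), ("c", 1, 0.35), ("d", 1, 0.8)]
print(approximate_eer_threshold(example))
```

Because the search is restricted to observed scores, the two error rates may not be exactly equal at the returned value; the sketch simply picks the threshold where they are closest.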

mednet.engine.classify.evaluator.maxf1_threshold(predictions)[source]

Calculate the threshold leading to the maximum F1-score on a precision-recall curve.

Parameters:

predictions (Iterable[tuple[str, int, float]]) – An iterable of models.classify.typing.BinaryPrediction objects.

Returns:

The threshold value leading to the maximum F1-score on the provided set of predictions.

Return type:

float
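
As an illustration of what a maximum-F1 threshold search does, the sketch below scans every observed score as a candidate threshold and keeps the one with the highest F1-score. It is not mednet's implementation; the function name approximate_maxf1_threshold and the convention that score >= threshold means "positive" are assumptions.

```python
from typing import Iterable

# (name, label, score): mirrors the tuple[str, int, float] shape in the signature.
Prediction = tuple[str, int, float]


def approximate_maxf1_threshold(predictions: Iterable[Prediction]) -> float:
    """Return the observed score that maximizes F1 when used as threshold."""
    preds = list(predictions)
    best_threshold, best_f1 = 0.0, -1.0
    for t in sorted({s for _, _, s in preds}):
        # Confusion counts under the assumed "score >= t is positive" rule.
        tp = sum(1 for _, y, s in preds if y == 1 and s >= t)
        fp = sum(1 for _, y, s in preds if y == 0 and s >= t)
        fn = sum(1 for _, y, s in preds if y == 1 and s < t)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold


example = [
    ("a", 0, 0.1), ("b", 0, 0.4),
    ("c", 1, 0.35), ("d", 1, 0.8), ("e", 1, 0.9),
]
print(approximate_maxf1_threshold(example))
```

Ties are resolved in favor of the lowest threshold, since a later candidate replaces the current best only when its F1-score is strictly higher.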

mednet.engine.classify.evaluator.run_binary(name, predictions, binning, threshold_a_priori=None)[source]

Run inference and calculate measures for binary classification.

Parameters:
  • name (str) – The name of the subset to load.

  • predictions (Iterable[tuple[str, int, float]]) – A list of predictions to consider for measurement.

  • binning (str | int) – The binning algorithm to use for computing the bin widths and distribution for histograms. Choose from algorithms supported by numpy.histogram().

  • threshold_a_priori (float | None) – A threshold chosen a priori, used when single-valued measures must be reported. If this value is not provided, an a posteriori threshold is calculated on the input scores themselves, which makes the reported measures a biased estimate.

Returns:

A dictionary containing the performance summary at the specified threshold, general performance curves (under the key curves), and score histograms (under the key score-histograms).

Return type:

dict[str, Any]
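
To show why a threshold fixed a priori matters, the sketch below computes single-valued measures for a set of predictions at a given threshold. The usual pattern is to choose the threshold on one split (for example, validation) and apply it unchanged to another (for example, test); choosing it on the same scores being evaluated gives optimistic numbers. The helper measures_at_threshold and the keys of its result are hypothetical and do not mirror the layout of the summary returned by run_binary().

```python
from typing import Iterable

# (name, label, score): mirrors the tuple[str, int, float] shape in the signature.
Prediction = tuple[str, int, float]


def measures_at_threshold(
    predictions: Iterable[Prediction], threshold: float
) -> dict[str, float]:
    """Compute basic binary-classification measures at a fixed threshold."""
    tp = fp = tn = fn = 0
    for _, label, score in predictions:
        predicted_positive = score >= threshold  # assumed decision rule
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios are reported as 0.0 in this sketch.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# The threshold here stands in for one selected beforehand on a different split.
a_priori_threshold = 0.5
test_split = [("t1", 0, 0.2), ("t2", 0, 0.6), ("t3", 1, 0.55), ("t4", 1, 0.9)]
print(measures_at_threshold(test_split, a_priori_threshold))
```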

mednet.engine.classify.evaluator.make_table(data, fmt)[source]

Tabulate summaries from multiple splits.

This function can properly tabulate the various summaries produced for all the splits in a prediction database.

Parameters:
Returns:

A string containing the tabulated information.

Return type:

str

mednet.engine.classify.evaluator.make_plots(results)[source]

Create plots for all curves and score distributions in results.

Parameters:

results (dict[str, dict[str, Any]]) – Evaluation data as returned by run_binary().

Returns:

A list of figures to record to file.

Return type:

list