mednet.engine.classify.evaluator
Defines functionality for the evaluation of classification predictions.
Functions

- eer_threshold(predictions) – Calculate the (approximate) threshold leading to the equal error rate.
- Create plots for all curves and score distributions.
- Tabulate summaries from multiple splits.
- maxf1_threshold(predictions) – Calculate the threshold leading to the maximum F1-score on a precision-recall curve.
- run(name, predictions, binning, rng, threshold_a_priori=None, credible_regions=False) – Run inference and calculate measures for binary or multilabel classification.
- mednet.engine.classify.evaluator.eer_threshold(predictions)

Calculate the (approximate) threshold leading to the equal error rate.

For multi-label problems, calculate the EER threshold in the “micro” sense by first rasterizing all scores and labels (with numpy.ravel()), and then using this (large) 1D vector as in a binary classifier.
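For orientation, the micro-mode reduction described above can be sketched as follows. This is an illustrative approximation, not mednet's actual implementation; the helper name approximate_eer_threshold and the use of sklearn.metrics.roc_curve are our assumptions.

    # Illustrative sketch only -- not mednet's implementation.
    import numpy
    from sklearn.metrics import roc_curve

    def approximate_eer_threshold(labels, scores):
        """Approximate the EER threshold after ravelling multi-label inputs."""
        y = numpy.ravel(labels)  # flatten (samples x labels) to one 1D vector
        s = numpy.ravel(scores)
        fpr, tpr, thresholds = roc_curve(y, s)
        fnr = 1.0 - tpr
        # the equal error rate sits where FPR and FNR cross; take the
        # threshold at the point where the two rates are closest
        index = numpy.argmin(numpy.abs(fpr - fnr))
        return float(thresholds[index])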
- mednet.engine.classify.evaluator.maxf1_threshold(predictions)

Calculate the threshold leading to the maximum F1-score on a precision-recall curve.

For multi-label problems, calculate the maximum F1-score threshold in the “micro” sense by first rasterizing all scores and labels (with numpy.ravel()), and then using this (large) 1D vector as in a binary classifier.
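The same reduction applies here. A minimal sketch, assuming sklearn.metrics.precision_recall_curve and an invented helper name:

    # Illustrative sketch only -- not mednet's implementation.
    import numpy
    from sklearn.metrics import precision_recall_curve

    def approximate_maxf1_threshold(labels, scores):
        """Approximate the maximum-F1 threshold after ravelling inputs."""
        y = numpy.ravel(labels)
        s = numpy.ravel(scores)
        precision, recall, thresholds = precision_recall_curve(y, s)
        # precision/recall have one more entry than thresholds; drop the
        # final (precision=1, recall=0) point before computing F1
        f1 = (2 * precision[:-1] * recall[:-1]
              / (precision[:-1] + recall[:-1] + 1e-12))
        return float(thresholds[numpy.argmax(f1)])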
- mednet.engine.classify.evaluator.run(name, predictions, binning, rng, threshold_a_priori=None, credible_regions=False)

Run inference and calculate measures for binary or multilabel classification.

For multi-label problems, calculate the metrics in the “micro” sense by first rasterizing all scores and labels (with numpy.ravel()), and then using this (large) 1D vector as in a binary classifier.

- Parameters:
  - name (str) – The name of the subset to load.
  - predictions (Sequence[tuple[str, Sequence[int], Sequence[float]]]) – A list of predictions to consider for measurement.
  - binning (str | int) – The binning algorithm to use for computing the bin widths and distribution for histograms. Choose from the algorithms supported by numpy.histogram().
  - rng (Generator) – An initialized numpy random number generator.
  - threshold_a_priori (float | None) – A threshold to use, evaluated a priori, when single values must be reported. If this value is not provided, an a posteriori threshold is calculated on the input scores. This is a biased estimator.
  - credible_regions (bool) – If set to True, also return credible intervals via credible.bayesian.metrics. Note that evaluating ROC-AUC and Average Precision confidence margins can be rather slow for larger datasets.
- Returns:
  A dictionary containing the performance summary at the specified threshold, general performance curves (under the key curves), and score histograms (under the key score-histograms).
- Return type:
  dict
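To make the calling convention concrete, here is a hypothetical invocation based on the signature above; the subset name, file names, labels, scores, and binning choice are all invented for illustration:

    # Hypothetical usage example; sample data is invented.
    import numpy
    from mednet.engine.classify.evaluator import run

    # one (name, labels, scores) tuple per sample, as described above
    predictions = [
        ("sample-001.png", [0], [0.12]),
        ("sample-002.png", [1], [0.87]),
        ("sample-003.png", [1], [0.65]),
    ]

    results = run(
        name="test",                       # subset name
        predictions=predictions,
        binning="doane",                   # any algorithm numpy.histogram() accepts
        rng=numpy.random.default_rng(42),  # initialized numpy Generator
        threshold_a_priori=0.5,            # use a fixed, a-priori threshold
    )

    # top-level keys include "curves" and "score-histograms"
    print(sorted(results.keys()))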