# Modules

## Evalutils

class evalutils.evalutils.Algorithm(*, index_key='input_image', file_loaders=None, file_filters=None, input_path=PosixPath('/input'), output_path=PosixPath('/output/images'), file_sorter_key=None, validators=None, output_file=PosixPath('/output/results.json'))[source]
__init__(*, index_key='input_image', file_loaders=None, file_filters=None, input_path=PosixPath('/input'), output_path=PosixPath('/output/images'), file_sorter_key=None, validators=None, output_file=PosixPath('/output/results.json'))[source]

The base class for all algorithms. Sets the environment and controls the flow of the processing once process is called.

Parameters
• index_key (str) – Fileloader key which must be used for the index. Default: input_image

• file_loaders (Optional[Dict[str, FileLoader]]) – The loaders that will be used to get all files. Default: evalutils.io.SimpleITKLoader for input_image

• file_filters (Optional[Dict[str, Optional[Pattern[str]]]]) – Regular expressions for filtering certain FileLoaders. Default: no filtering.

• input_path (Path) – The path in the container where the input files will be loaded from. Default: /input

• output_path (Path) – The path in the container where the output images will be written. Default: /output/images

• file_sorter_key (Optional[Callable]) – A function that determines how files in the input_path are sorted. Default: None (alphanumerical)

• validators (Optional[Dict[str, Tuple[DataFrameValidator, ...]]]) – A dictionary containing the validators that will be used on the loaded data per file_loader key. Default: evalutils.validators.UniqueImagesValidator for input_image

• output_file (PathLike) – The path to the location where the results will be written. Default: /output/results.json

validate()[source]

Validates each dataframe for each fileloader separately

class evalutils.evalutils.BaseEvaluation(*, ground_truth_path=PosixPath('/opt/evaluation/ground-truth'), predictions_path=PosixPath('/input'), file_sorter_key=<function first_int_in_filename_key>, file_loader, validators, join_key=None, aggregates=None, output_file=PosixPath('/output/metrics.json'))[source]
__init__(*, ground_truth_path=PosixPath('/opt/evaluation/ground-truth'), predictions_path=PosixPath('/input'), file_sorter_key=<function first_int_in_filename_key>, file_loader, validators, join_key=None, aggregates=None, output_file=PosixPath('/output/metrics.json'))[source]

The base class for all evaluations. Sets the environment and controls the flow of the evaluation once evaluate is called.

Parameters
• ground_truth_path (Path) – The path in the container where the ground truth will be loaded from

• predictions_path (Path) – The path in the container where the submission will be loaded from

• file_sorter_key (Callable) – A function that determines how files are sorted and matched together

• file_loader (FileLoader) – The loader that will be used to get all files

• validators (Tuple[DataFrameValidator, ...]) – A tuple containing all the validators that will be used on the loaded data

• join_key (Optional[str]) – The column that will be used to join the predictions and ground truth tables

• aggregates (Optional[Set[str]]) – The set of aggregates that will be calculated by pandas.DataFrame.describe

• output_file (PathLike) – The path to the location where the results will be written

abstract cross_validate()[source]

Validates both dataframes

validate()[source]

Validates each dataframe separately

class evalutils.evalutils.ClassificationAlgorithm(*, index_key='input_image', file_loaders=None, file_filters=None, input_path=PosixPath('/input'), output_path=PosixPath('/output/images'), file_sorter_key=None, validators=None, output_file=PosixPath('/output/results.json'))[source]
class evalutils.evalutils.ClassificationEvaluation(*, ground_truth_path=PosixPath('/opt/evaluation/ground-truth'), predictions_path=PosixPath('/input'), file_sorter_key=<function first_int_in_filename_key>, file_loader, validators, join_key=None, aggregates=None, output_file=PosixPath('/output/metrics.json'))[source]

A ClassificationEvaluation has the same number of predictions as ground truth cases, one prediction per case. Typical tasks include determining the stage of a case or segmenting structures in a case.

cross_validate()[source]

Validates both dataframes

class evalutils.evalutils.DetectionAlgorithm(*, index_key='input_image', file_loaders=None, file_filters=None, input_path=PosixPath('/input'), output_path=PosixPath('/output/images'), file_sorter_key=None, validators=None, output_file=PosixPath('/output/results.json'))[source]
class evalutils.evalutils.DetectionEvaluation(*args, detection_radius, detection_threshold, **kwargs)[source]

DetectionEvaluations have a different number of predictions from the number of ground truth annotations. An example would be detecting lung nodules in a CT volume, or malignant cells in a pathology slide.

__init__(*args, detection_radius, detection_threshold, **kwargs)[source]

The base class for all evaluations. Sets the environment and controls the flow of the evaluation once evaluate is called.

Parameters
• ground_truth_path – The path in the container where the ground truth will be loaded from

• predictions_path – The path in the container where the submission will be loaded from

• file_sorter_key – A function that determines how files are sorted and matched together

• file_loader – The loader that will be used to get all files

• validators – A tuple containing all the validators that will be used on the loaded data

• join_key – The column that will be used to join the predictions and ground truth tables

• aggregates – The set of aggregates that will be calculated by pandas.DataFrame.describe

• output_file – The path to the location where the results will be written

cross_validate()[source]

Validates both dataframes

class evalutils.evalutils.Evaluation(*args, **kwargs)[source]

Legacy class. Use ClassificationEvaluation instead.

__init__(*args, **kwargs)[source]

The base class for all evaluations. Sets the environment and controls the flow of the evaluation once evaluate is called.

Parameters
• ground_truth_path – The path in the container where the ground truth will be loaded from

• predictions_path – The path in the container where the submission will be loaded from

• file_sorter_key – A function that determines how files are sorted and matched together

• file_loader – The loader that will be used to get all files

• validators – A tuple containing all the validators that will be used on the loaded data

• join_key – The column that will be used to join the predictions and ground truth tables

• aggregates – The set of aggregates that will be calculated by pandas.DataFrame.describe

• output_file – The path to the location where the results will be written

class evalutils.evalutils.SegmentationAlgorithm(*, index_key='input_image', file_loaders=None, file_filters=None, input_path=PosixPath('/input'), output_path=PosixPath('/output/images'), file_sorter_key=None, validators=None, output_file=PosixPath('/output/results.json'))[source]

## IO

class evalutils.io.CSVLoader[source]
load(*, fname)[source]

Tries to load the file given by the path fname.

Notes

For the loaded data to work with the validators:

• If you load an image, its hash must be saved in the hash column.

• If you reference a Path, it must be saved in the path column.

Parameters

fname (Path) – The file that the loader will try to load

Return type

A list containing all of the cases in this file

Raises

FileLoaderError – If a file cannot be loaded as the specified type

class evalutils.io.FileLoader[source]
abstract load(*, fname)[source]

Tries to load the file given by the path fname.

Notes

For the loaded data to work with the validators:

• If you load an image, its hash must be saved in the hash column.

• If you reference a Path, it must be saved in the path column.

Parameters

fname (Path) – The file that the loader will try to load

Return type

A list containing all of the cases in this file

Raises

FileLoaderError – If a file cannot be loaded as the specified type

class evalutils.io.ImageIOLoader[source]
static hash_image(image)[source]

Generates a hash of the image

Parameters

image – The image to hash

Return type

The hash of the image

static load_image(fname)[source]

Loads the image

Parameters

fname – The path that the loader will try to load

Return type

The image

class evalutils.io.ImageLoader[source]

A specialised file loader for images. As images are large they will not all be loaded into memory, so score_case needs to load them again later via load_image.

static hash_image(image)[source]

Generates a hash of the image

Parameters

image – The image to hash

Return type

The hash of the image

load(*, fname)[source]

Tries to load the file given by the path fname.

Notes

For the loaded data to work with the validators:

• If you load an image, its hash must be saved in the hash column.

• If you reference a Path, it must be saved in the path column.

Parameters

fname (Path) – The file that the loader will try to load

Return type

A list containing all of the cases in this file

Raises

FileLoaderError – If a file cannot be loaded as the specified type

static load_image(fname)[source]

Loads the image

Parameters

fname (Path) – The path that the loader will try to load

Return type

The image

class evalutils.io.SimpleITKLoader[source]
static hash_image(image)[source]

Generates a hash of the image

Parameters

image – The image to hash

Return type

The hash of the image

static load_image(fname)[source]

Loads the image

Parameters

fname – The path that the loader will try to load

Return type

The image

evalutils.io.get_first_int_in(s)[source]

Gets the first integer in a string.

Parameters

s (str) – The string to search for an int

Return type

The first integer found in the string

Raises

AttributeError – If there is not an int contained in the string
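The behaviour described above can be sketched with the standard `re` module. This is a minimal stand-in, not the library's source: the name `first_int_in` and the sample file names are hypothetical, and the regular expression is an assumption about what counts as an integer.

```python
import re


def first_int_in(s: str) -> int:
    """Return the first run of digits in ``s`` as an int (illustrative stand-in)."""
    match = re.search(r"\d+", s)
    # re.search returns None when nothing matches, so .group() raises
    # AttributeError, which agrees with the failure mode documented above.
    return int(match.group())


# Sorting by the first integer avoids the "10 before 2" problem of plain
# string sorting, which is why such a key is useful for matching files.
names = ["10.mha", "2.mha", "1.mha"]
print(sorted(names, key=first_int_in))  # ['1.mha', '2.mha', '10.mha']
```

A key of this kind is the sort of callable that a file_sorter_key parameter accepts.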

## Validators

class evalutils.validators.DataFrameValidator[source]
abstract validate(*, df)[source]

Validates a single aspect of a DataFrame

Parameters

df (DataFrame) – The DataFrame to be validated

Return type

None if the DataFrame is valid

Raises

ValidationError – If the DataFrame is not valid

class evalutils.validators.ExpectedColumnNamesValidator(*, expected, extra_cols_check=True)[source]
__init__(*, expected, extra_cols_check=True)[source]

Validates that the DataFrame has the expected columns

Parameters
• expected (Tuple[str, ...]) – The expected columns in the DataFrame

• extra_cols_check (bool) – Perform the check for extra columns, default is true but you may want to disable this if you’re sure that extra columns can be ignored.

Raises

ValueError – If no columns are defined

validate(*, df)[source]

Validates a single aspect of a DataFrame

Parameters

df (DataFrame) – The DataFrame to be validated

Return type

None if the DataFrame is valid

Raises

ValidationError – If the DataFrame is not valid
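The column check can be illustrated without pandas by working on a plain list of column names. This is a sketch of the documented behaviour only: `check_columns` and `ColumnError` are hypothetical stand-ins for the validator and for ValidationError, and the error messages are invented for illustration.

```python
class ColumnError(Exception):
    """Hypothetical stand-in for the library's ValidationError."""


def check_columns(columns, expected, extra_cols_check=True):
    """Raise ColumnError when expected columns are missing or, optionally,
    when unexpected columns are present."""
    if not expected:
        # Mirrors the documented ValueError when no columns are defined.
        raise ValueError("at least one expected column must be defined")
    missing = [c for c in expected if c not in columns]
    if missing:
        raise ColumnError(f"missing columns: {missing}")
    extra = [c for c in columns if c not in expected]
    if extra_cols_check and extra:
        raise ColumnError(f"unexpected columns: {extra}")


# Both calls pass silently: the second one tolerates the extra "note" column.
check_columns(["case", "label"], expected=("case", "label"))
check_columns(["case", "label", "note"], expected=("case", "label"),
              extra_cols_check=False)
```

Disabling the extra-column check is the looser mode described for extra_cols_check: missing columns still fail, surplus ones are ignored.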

class evalutils.validators.NumberOfCasesValidator(*, num_cases)[source]
__init__(*, num_cases)[source]

Validates that there are the correct number of cases in the set.

Parameters

num_cases (int) – The number of cases that we expect to find.

validate(*, df)[source]

Validates a single aspect of a DataFrame

Parameters

df (DataFrame) – The DataFrame to be validated

Return type

None if the DataFrame is valid

Raises

ValidationError – If the DataFrame is not valid

class evalutils.validators.UniqueImagesValidator[source]

Validates that each image in the set is unique

validate(*, df)[source]

Validates a single aspect of a DataFrame

Parameters

df (DataFrame) – The DataFrame to be validated

Return type

None if the DataFrame is valid

Raises

ValidationError – If the DataFrame is not valid

class evalutils.validators.UniquePathIndicesValidator[source]

Validates that the indices from the filenames are unique

validate(*, df)[source]

Validates a single aspect of a DataFrame

Parameters

df (DataFrame) – The DataFrame to be validated

Return type

None if the DataFrame is valid

Raises

ValidationError – If the DataFrame is not valid

## Scorers

class evalutils.scorers.DetectionScore(true_positives, false_negatives, false_positives)
property false_negatives

Alias for field number 1

property false_positives

Alias for field number 2

property true_positives

Alias for field number 0

evalutils.scorers.find_hits_for_targets(*, targets, predictions, radius)[source]

Generates a list of the predicted points that are within a given radius of each target. The indices are returned in sorted order, from the closest to the farthest point.

Parameters
• targets (List[Tuple[float, ...]]) – A list of target points

• predictions (List[Tuple[float, ...]]) – A list of predicted points

• radius (float) – The maximum distance that two points can be apart for them to be considered a hit

Return type

List[Tuple[int, ...]]

Returns

A list which has the same length as the targets list. Each element of this list contains the indices of the predictions that are considered hits for the corresponding target.
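A brute-force version of this search makes the return shape concrete. The sketch below is an assumption-laden stand-in rather than the library's implementation: the name `hits_for_targets` is hypothetical, a point exactly on the radius is treated as a hit, and `math.dist` requires Python 3.8 or later.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def hits_for_targets(
    targets: Sequence[Point], predictions: Sequence[Point], radius: float
) -> List[Tuple[int, ...]]:
    """For each target, the indices of the predictions within ``radius``."""
    result = []
    for target in targets:
        # Pair each prediction index with its distance to this target.
        distances = [(dist(target, p), i) for i, p in enumerate(predictions)]
        # Keep the hits and order them from closest to farthest.
        hits = sorted((d, i) for d, i in distances if d <= radius)
        result.append(tuple(i for _, i in hits))
    return result


targets = [(0.0, 0.0), (10.0, 10.0)]
predictions = [(0.5, 0.0), (0.1, 0.0), (5.0, 5.0)]
print(hits_for_targets(targets, predictions, radius=1.0))  # [(1, 0), ()]
```

The first target is hit by predictions 1 and 0, in that order because prediction 1 is closer; the second target has no hits, so its entry is empty.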

evalutils.scorers.score_detection(*, ground_truth, predictions, radius=1.0)[source]

Generates the number of true positives, false positives and false negatives for the ground truth points given the predicted points.

If multiple predicted points hit one ground truth point then this is considered as 1 true positive, and 0 false negatives.

If one predicted point is a hit for N ground truth points then this is considered as 1 true positive, and N-1 false negatives.

Parameters
• ground_truth (List[Tuple[float, ...]]) – A list of the ground truth points

• predictions (List[Tuple[float, ...]]) – A list of the predicted points

• radius (float) – The maximum distance that two points can be separated by in order to be considered a hit

Return type

DetectionScore

Returns

A DetectionScore named tuple containing the number of true positives, false negatives and false positives, in that field order.
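The counting rules above can be reproduced with a greedy matching in which each prediction may satisfy at most one ground truth point. This is a sketch under stated assumptions, not the library's code: `Score` and `score_points` are hypothetical names, and counting surplus predictions as false positives is a choice made here that the description does not spell out.

```python
from collections import namedtuple
from math import dist

# Local stand-in with the same field order as the documented DetectionScore.
Score = namedtuple("Score", ("true_positives", "false_negatives", "false_positives"))


def score_points(ground_truth, predictions, radius=1.0):
    """Greedy matching: each prediction can satisfy at most one target."""
    used = [False] * len(predictions)
    true_positives = 0
    false_negatives = 0
    for target in ground_truth:
        # Candidate predictions for this target, closest first.
        hits = sorted(
            (dist(target, p), i)
            for i, p in enumerate(predictions)
            if dist(target, p) <= radius
        )
        for _, i in hits:
            if not used[i]:
                used[i] = True
                true_positives += 1
                break
        else:
            # No unused prediction lies within the radius of this target.
            false_negatives += 1
    # Predictions never matched to a target count as false positives here.
    return Score(true_positives, false_negatives, used.count(False))


# One prediction within reach of two ground truth points: 1 TP, 2 - 1 = 1 FN.
print(score_points([(0.0, 0.0), (1.0, 0.0)], [(0.5, 0.0)]))
# Two predictions on one ground truth point: 1 TP, 0 FN (and 1 FP in this sketch).
print(score_points([(0.0, 0.0)], [(0.1, 0.0), (0.2, 0.0)]))
```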

## Annotations

class evalutils.annotations.BoundingBox(*, x1, x2, y1, y2)[source]
__init__(*, x1, x2, y1, y2)[source]

A bounding box is a face defined by 4 edges on a 2D plane. It must have a non-zero width and height.

Parameters
• x1 (float) – Left edge of the bounding box

• x2 (float) – Right edge of the bounding box

• y1 (float) – Bottom edge of the bounding box

• y2 (float) – Top edge of the bounding box

Raises

ValueError – If the bounding box has zero width or height

property area: float

Return the area of the bounding box in natural units

Return type

float

intersection(*, other)[source]

Calculates the intersection area between this bounding box and another, axis aligned, bounding box.

Parameters

other (BoundingBox) – The other bounding box

Returns

The intersection area in natural units if the two bounding boxes overlap, zero otherwise.

Return type

float

jaccard_index(*, other)[source]

Calculates the intersection over union between this bounding box and a second, axis aligned, bounding box.

Parameters

other (BoundingBox) – The other bounding box

Returns

The intersection over union in natural units

Return type

float

union(*, other)[source]

Calculates the union between this bounding box and another, axis aligned, bounding box.

Parameters

other (BoundingBox) – The other bounding box

Returns

The union area in natural units

Return type

float
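The three area computations are related by inclusion-exclusion, which a small stand-in class shows. This is a sketch and not the library's BoundingBox: the class name `Box` is hypothetical, it skips the zero-size validation, and its methods take `other` positionally rather than as a keyword.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Stand-in for an axis-aligned box with x1 < x2 and y1 < y2."""

    x1: float
    x2: float
    y1: float
    y2: float

    @property
    def area(self) -> float:
        return (self.x2 - self.x1) * (self.y2 - self.y1)

    def intersection(self, other: "Box") -> float:
        # Overlap along each axis; non-positive values mean no overlap.
        width = min(self.x2, other.x2) - max(self.x1, other.x1)
        height = min(self.y2, other.y2) - max(self.y1, other.y1)
        return width * height if width > 0 and height > 0 else 0.0

    def union(self, other: "Box") -> float:
        # Inclusion-exclusion: the overlap would otherwise be counted twice.
        return self.area + other.area - self.intersection(other)

    def jaccard_index(self, other: "Box") -> float:
        return self.intersection(other) / self.union(other)


a = Box(x1=0, x2=2, y1=0, y2=2)
b = Box(x1=1, x2=3, y1=1, y2=3)
print(a.intersection(b), a.union(b), a.jaccard_index(b))  # 1 7 0.14285714285714285
```

Because both boxes must have non-zero area, the union is always positive and the division in jaccard_index is safe.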

## Statistics

class evalutils.stats.HausdorffMeasures(distance, modified_distance, percentile_distance)
property distance

Alias for field number 0

property modified_distance

Alias for field number 1

property percentile_distance

Alias for field number 2

evalutils.stats.absolute_volume_difference(s1, s2, voxelspacing=None)[source]

Calculate absolute volume difference from s2 to s1

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else. s1 is taken to be the reference.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

Return type

float

Returns

The absolute volume difference between the object(s) in s1 and the object(s) in s2. This is a percentage value in the range $$[0, +\infty)$$ for which $$0$$ denotes an ideal score.

Notes

This is a real metric

evalutils.stats.accuracies_from_confusion_matrix(cm)[source]

Computes accuracy scores from a confusion matrix

Parameters

cm (ndarray) – N x N Input confusion matrix

Return type

1d ndarray containing accuracy scores for all N classes

evalutils.stats.calculate_confusion_matrix(y_true, y_pred, labels)[source]

Efficient confusion matrix calculation, based on sklearn interface

Parameters
• y_true (ndarray) – Target multi-object segmentation mask

• y_pred (ndarray) – Predicted multi-object segmentation mask

• labels (List[int]) – Inclusive list of N labels to compute the confusion matrix for.

Return type

N x N confusion matrix for y_pred w.r.t. y_true

Notes

By definition a confusion matrix $$C$$ is such that $$C_{i, j}$$ is equal to the number of observations known to be in group $$i$$ but predicted to be in group $$j$$.
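The definition in the note can be checked with a few lines of plain Python. This sketch works on flat lists instead of ndarray masks and is not the library's efficient implementation; the function name `confusion_matrix` and the choice to ignore unlisted labels are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count pairs so that cm[i][j] is the number of items whose true label
    is labels[i] and whose predicted label is labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        # Items whose label is not listed are ignored in this sketch.
        if t in index and p in index:
            cm[index[t]][index[p]] += 1
    return cm


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(confusion_matrix(y_true, y_pred, labels=[0, 1, 2]))
# [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
```

Row sums give the number of items per true class, and the diagonal holds the correctly predicted items.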

evalutils.stats.dice_from_confusion_matrix(cm)[source]

Computes Dice scores from a confusion matrix

Parameters

cm (ndarray) – N x N Input confusion matrix

Return type

1d ndarray containing Dice scores for all N classes
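Per-class Dice follows from the confusion matrix through the standard formula 2·TP / (2·TP + FP + FN). The sketch below uses nested lists rather than an ndarray; `dice_per_class` is a hypothetical name, and returning 0.0 for a class that never occurs is a choice made for this example, since the handling of that case is not documented above.

```python
def dice_per_class(cm):
    """Dice = 2*TP / (2*TP + FP + FN) for each class of a square matrix."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        # Row k holds items truly in class k; the off-diagonal ones are misses.
        fn = sum(cm[k]) - tp
        # Column k holds items predicted as class k; off-diagonal ones are false alarms.
        fp = sum(cm[i][k] for i in range(n)) - tp
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return scores


print(dice_per_class([[1, 1, 0], [0, 2, 0], [1, 0, 1]]))
```

For this matrix the three classes score 0.5, 0.8 and 2/3 respectively.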

evalutils.stats.dice_to_jaccard(dice)[source]

Conversion computation from Dice to Jaccard

Parameters

dice (ndarray) – 1 or N Dice values within [0 .. 1]

Return type

1 or N Jaccard values within [0 .. 1]
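The conversion follows from the identities J = D / (2 − D) and D = 2J / (1 + J), which are inverses of each other. The scalar sketch below illustrates them; the names `to_jaccard` and `to_dice` are hypothetical, and the library functions operate on ndarrays rather than single floats.

```python
def to_jaccard(dice: float) -> float:
    # J = D / (2 - D); the denominator is at least 1 for D in [0, 1].
    return dice / (2.0 - dice)


def to_dice(jaccard: float) -> float:
    # D = 2J / (1 + J); the denominator is at least 1 for J in [0, 1].
    return 2.0 * jaccard / (1.0 + jaccard)


for d in (0.0, 0.5, 0.8, 1.0):
    j = to_jaccard(d)
    # The round trip recovers the original Dice value up to rounding.
    print(d, round(j, 4), round(to_dice(j), 4))
```

Both conversions map 0 to 0 and 1 to 1, and the Jaccard value never exceeds the Dice value in between.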

evalutils.stats.distance_transform_edt_float32(input, sampling=None, return_distances=True, return_indices=False, distances=None, indices=None)[source]

Memory efficient version of scipy.ndimage.distance_transform_edt

The same as scipy.ndimage.distance_transform_edt but using float32 and better memory cleaning internally.

In addition to the distance transform, the feature transform can be calculated. In this case the index of the closest background element is returned along the first axis of the result.

Parameters
• input – Input data to transform. Can be any type but will be converted into binary: 1 wherever input equates to True, 0 elsewhere.

• sampling – Spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• return_distances – Whether to return distance matrix. At least one of return_distances/return_indices must be True. Default is True.

• return_indices – Whether to return indices matrix. Default is False.

• distances – Used for output of distance array, must be of type float64.

• indices – Used for output of indices, must be of type int32.

Returns

Either distance matrix, index matrix, or a list of the two, depending on return_x flags and distance and indices input parameters.

Return type

distance_transform_edt

Notes

The euclidean distance transform gives values of the euclidean distance:

$$y_i = \sqrt{\sum_{i}^{n} (x[i] - b[i])^2}$$


where b[i] is the background point (value 0) with the smallest Euclidean distance to input points x[i], and n is the number of dimensions.
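A brute-force version makes the definition concrete for a small 2D grid. This is an illustrative sketch only, far slower than the scipy routine this function wraps: `brute_force_edt` is a hypothetical name, and it assumes the grid contains at least one background (zero) cell.

```python
from math import sqrt


def brute_force_edt(grid, sampling=(1.0, 1.0)):
    """Distance from every non-zero cell of a 2D grid to the nearest zero cell."""
    background = [
        (r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if not v
    ]
    out = []
    for r, row in enumerate(grid):
        out_row = []
        for c, v in enumerate(row):
            if not v:
                out_row.append(0.0)  # background cells are at distance zero
                continue
            # Scale index offsets by the spacing along each axis before measuring.
            out_row.append(min(
                sqrt(((r - br) * sampling[0]) ** 2 + ((c - bc) * sampling[1]) ** 2)
                for br, bc in background
            ))
        out.append(out_row)
    return out


grid = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
print(brute_force_edt(grid))  # [[0.0, 1.0, 2.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
```

Passing a different `sampling` tuple rescales the distances per axis, which is the role the sampling parameter plays for anisotropic voxels.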

Copyright (C) 2003-2005 Peter J. Verveer

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

evalutils.stats.hausdorff_distance(s1, s2, voxelspacing=None, connectivity=1, edt_method=<function distance_transform_edt_float32>)[source]

Computes the (symmetric) Hausdorff Distance (HD) between the binary objects in two images. It is defined as the maximum surface distance between the objects.

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• connectivity (int) – The neighbourhood/connectivity considered when determining the surface of the binary objects. This value is passed to scipy.ndimage.generate_binary_structure and should usually be $$> 1$$.

• edt_method (Callable) – Method used for computing the euclidean distance transform. By default it uses a variant on the scipy.ndimage.distance_transform_edt method that uses float32 data to reduce memory costs at the cost of some additional compute time.

Return type

float

Returns

The symmetric Hausdorff Distance between the object(s) in s1 and the object(s) in s2. The distance unit is the same as for the spacing of elements along each dimension, which is usually given in mm.

Notes

This is a real metric. Implementation inspired by medpy.metric.binary http://pythonhosted.org/MedPy/_modules/medpy/metric/binary.html

evalutils.stats.hausdorff_distance_measures(s1, s2, voxelspacing=None, connectivity=1, percentile=0.95, edt_method=<function distance_transform_edt_float32>)[source]

Returns multiple Hausdorff measures: (hd, modified_hd, percentile_hd). Since the measures share common calculations, they can be calculated more efficiently together.

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• connectivity (int) – The neighbourhood/connectivity considered when determining the surface of the binary objects. This value is passed to scipy.ndimage.generate_binary_structure and should usually be $$> 1$$.

• percentile (float) – The percentile at which to calculate the Hausdorff Distance

• edt_method (Callable) – Method used for computing the euclidean distance transform. By default it uses a variant on the scipy.ndimage.distance_transform_edt method that uses float32 data to reduce memory costs at the cost of some additional compute time.

Return type

HausdorffMeasures

Returns

The Hausdorff distance, modified Hausdorff distance and percentile Hausdorff distance.

Notes

This returns real metrics.
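The shared calculation can be illustrated on plain point sets: both directed nearest-neighbour distance lists are computed once and then reduced in three ways. This is a sketch under stated assumptions, not the library's implementation, which works on the surfaces of binary images via a distance transform. The names are hypothetical, and the nearest-rank percentile rule is an assumption, since other interpolation rules exist.

```python
from math import ceil, dist


def directed_distances(a, b):
    """For each point in ``a``, the distance to its nearest point in ``b``."""
    return [min(dist(p, q) for q in b) for p in a]


def nearest_rank(values, fraction):
    """Nearest-rank percentile; an assumption, other interpolation rules exist."""
    ordered = sorted(values)
    return ordered[max(0, ceil(fraction * len(ordered)) - 1)]


def hausdorff_measures(a, b, percentile=0.95):
    # Both directed distance lists are computed once and shared by all measures.
    ab, ba = directed_distances(a, b), directed_distances(b, a)
    hd = max(max(ab), max(ba))  # maximum surface distance
    modified_hd = max(sum(ab) / len(ab), sum(ba) / len(ba))  # maximum of the averages
    percentile_hd = max(nearest_rank(ab, percentile), nearest_rank(ba, percentile))
    return hd, modified_hd, percentile_hd


a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
b = [(0.0, 1.0), (1.0, 1.0), (2.0, 4.0)]
print(hausdorff_measures(a, b))
```

In this example the outlier at (2, 4) dominates the plain Hausdorff distance, while the modified variant averages it with the closer points.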

evalutils.stats.jaccard_from_confusion_matrix(cm)[source]

Computes Jaccard scores from a confusion matrix a.k.a. intersection over union (IoU)

Parameters

cm (ndarray) – N x N Input confusion matrix

Return type

1d ndarray containing Jaccard scores for all N classes

evalutils.stats.jaccard_to_dice(jacc)[source]

Conversion computation from Jaccard to Dice

Parameters

jacc (ndarray) – 1 or N Jaccard values within [0 .. 1]

Return type

1 or N Dice values within [0 .. 1]

evalutils.stats.mean_contour_distance(s1, s2, voxelspacing=None, edt_method=<function distance_transform_edt_float32>)[source]

Computes the (symmetric) Mean Contour Distance between the binary objects in two images. It is defined as the maximum average surface distance between the objects.

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• edt_method (Callable) – Method used for computing the euclidean distance transform. By default it uses a variant on the scipy.ndimage.distance_transform_edt method that uses float32 data to reduce memory costs at the cost of some additional compute time.

Return type

float

Returns

The symmetric Mean Contour Distance between the object(s) in s1 and the object(s) in s2. The distance unit is the same as for the spacing of elements along each dimension, which is usually given in mm.

Notes

This is a real metric that mimics the ITK MeanContourDistanceFilter.

evalutils.stats.modified_hausdorff_distance(s1, s2, voxelspacing=None, connectivity=1, edt_method=<function distance_transform_edt_float32>)[source]

Computes the (symmetric) Modified Hausdorff Distance (MHD) between the binary objects in two images. It is defined as the maximum average surface distance between the objects.

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• connectivity (int) – The neighbourhood/connectivity considered when determining the surface of the binary objects. This value is passed to scipy.ndimage.generate_binary_structure and should usually be $$> 1$$.

• edt_method (Callable) – Method used for computing the euclidean distance transform. By default it uses a variant on the scipy.ndimage.distance_transform_edt method that uses float32 data to reduce memory costs at the cost of some additional compute time.

Return type

float

Returns

The symmetric Modified Hausdorff Distance between the object(s) in s1 and the object(s) in s2. The distance unit is the same as for the spacing of elements along each dimension, which is usually given in mm.

Notes

This is a real metric.

evalutils.stats.percentile_hausdorff_distance(s1, s2, percentile=0.95, voxelspacing=None, connectivity=1, edt_method=<function distance_transform_edt_float32>)[source]

Nth Percentile Hausdorff Distance.

Computes a percentile for the (symmetric) Hausdorff Distance between the binary objects in two images. It is defined as the maximum surface distance between the objects at the nth percentile.

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

• percentile (Union[int, float]) – The percentile to perform the comparison on the two sorted distance sets

• voxelspacing (Union[Tuple[Union[float, int], ...], List[Union[float, int]], float, int, None]) – The voxelspacing in a distance unit i.e. spacing of elements along each dimension. If a sequence, must be of length equal to the input rank; if a single number, this is used for all axes. If not specified, a grid spacing of unity is implied.

• connectivity (int) – The neighbourhood/connectivity considered when determining the surface of the binary objects. This value is passed to scipy.ndimage.generate_binary_structure and should usually be $$> 1$$.

• edt_method (Callable) – Method used for computing the euclidean distance transform. By default it uses a variant on the scipy.ndimage.distance_transform_edt method that uses float32 data to reduce memory costs at the cost of some additional compute time.

Return type

float

Returns

The maximum Percentile Hausdorff Distance between the object(s) in s1 and the object(s) in s2 at the given percentile. The distance unit is the same as for the spacing of elements along each dimension, which is usually given in mm.

See also

hausdorff_distance()

Notes

This is a real metric.

evalutils.stats.relative_absolute_volume_difference(s1, s2)[source]

Calculate relative absolute volume difference from s2 to s1

Parameters
• s1 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else. s1 is taken to be the reference.

• s2 (ndarray) – Input data containing objects. Can be any type but will be converted into binary: background where 0, object everywhere else.

Return type

float

Returns

The relative absolute volume difference between the object(s) in s1 and the object(s) in s2. This is a percentage value in the range $$[0, +\infty)$$ for which $$0$$ denotes an ideal score.

Notes

This is not a real metric: it is asymmetric.
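The asymmetry comes from normalising by the reference volume, so swapping the arguments changes the denominator. The sketch below assumes the usual form |V(s2) − V(s1)| / V(s1) on flat lists; the function name is hypothetical, the result is returned as a fraction (whether the library scales it by 100 is not stated above), and the error for an empty reference is a choice made here.

```python
def relative_abs_volume_difference(s1, s2):
    """|V(s2) - V(s1)| / V(s1), with s1 as the reference (returned as a fraction)."""
    v1 = sum(1 for v in s1 if v)  # non-zero entries count as object
    v2 = sum(1 for v in s2 if v)
    if v1 == 0:
        raise ValueError("the reference s1 contains no object voxels")
    return abs(v2 - v1) / v1


s1 = [1, 1, 1, 1, 0, 0]  # 4 object voxels
s2 = [1, 1, 0, 0, 0, 0]  # 2 object voxels
print(relative_abs_volume_difference(s1, s2))  # 0.5
print(relative_abs_volume_difference(s2, s1))  # 1.0
```

The same pair of masks yields 0.5 or 1.0 depending on which one is treated as the reference.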

## ROC

class evalutils.roc.BootstrappedCIPointError(mean_fprs, mean_tprs, low_tpr_vals, high_tpr_vals, low_fpr_vals, high_fpr_vals)[source]
property high_fpr_vals

Alias for field number 5

property high_tpr_vals

Alias for field number 3

property low_fpr_vals

Alias for field number 4

property low_tpr_vals

Alias for field number 2

property mean_fprs

Alias for field number 0

property mean_tprs

Alias for field number 1

class evalutils.roc.BootstrappedROCCICurves(fpr_vals, mean_tpr_vals, low_tpr_vals, high_tpr_vals, low_az_val, high_az_val)[source]
property fpr_vals

Alias for field number 0

property high_az_val

Alias for field number 5

property high_tpr_vals

Alias for field number 3

property low_az_val

Alias for field number 4

property low_tpr_vals

Alias for field number 2

property mean_tpr_vals

Alias for field number 1

evalutils.roc.average_roc_curves(roc_curves, bins=200)[source]

Averages ROC curves using vertical averaging (fixed FP rates), which gives a 1D measure of variability.

Parameters
• roc_curves – List of BootstrappedROCCICurves to be averaged

• bins (optional) – Number of false-positive rate values to iterate over. (Default: 200)

Returns

ROC class containing the average over all ROCs.

Return type

BootstrappedROCCICurves
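Vertical averaging means sampling every curve at the same fixed false-positive rates and averaging the true-positive rates found there. The sketch below shows the idea on curves given as plain (fprs, tprs) pairs; it is not the library's implementation, the helper names are hypothetical, and it assumes each curve spans FPR values from 0 to 1 in non-decreasing order.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be non-decreasing."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            if x1 == x0:
                return max(y0, y1)  # vertical step: take the upper value
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the curve")


def vertical_average(curves, bins=5):
    """Average TPR over several curves at a shared, evenly spaced FPR grid."""
    grid = [k / (bins - 1) for k in range(bins)]
    mean_tpr = [
        sum(interpolate(f, fprs, tprs) for fprs, tprs in curves) / len(curves)
        for f in grid
    ]
    return grid, mean_tpr


curve_a = ([0.0, 0.5, 1.0], [0.0, 0.75, 1.0])
curve_b = ([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
print(vertical_average([curve_a, curve_b], bins=3))
# ([0.0, 0.5, 1.0], [0.0, 0.625, 1.0])
```

Because only the TPR varies at each fixed FPR, the spread across curves is one-dimensional, which is the property the description refers to.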

evalutils.roc.get_bootstrapped_ci_point_error(y_score, y_true, num_bootstraps=100, ci_to_use=0.95, exclude_first_last=True)[source]

Produces confidence-interval errors for individual points on a ROC curve. This is useful when only a few ROC points exist, so that they will be plotted individually, e.g. when the range of score values in y_score is very small (such as manual observer scores).

Note that this method works only by analysing the cloud of bootstrapped points generated for a particular threshold value, so a fixed number of threshold values is essential. The scores in y_score must therefore come from a fixed discrete set of values, e.g. [1, 2, 3, 4, 5].

Bootstrapping is done by selecting len(y_score) samples randomly (with replacement) from y_score and y_true. This is done num_bootstraps times.

Parameters
• y_score (ndarray) – The scores produced by the system being evaluated. A discrete set of possible scores must be used.

• y_true (ndarray) – The true labels (1 or 0) which are the reference standard being used

• num_bootstraps (int) – How many times to make a random sample with replacement

• ci_to_use (float) – Which confidence interval is required.

• exclude_first_last (bool) – The first and last ROC point (0,0 and 1,1) are usually irrelevant in these scenarios where only a few ROC points will be individually plotted. Set this to true to ignore these first and last points.

Return type

BootstrappedCIPointError

Returns

• mean_fprs – The array of mean fpr values (1 per possible ROC point)

• mean_tprs – The array of mean tpr values (1 per possible ROC point)

• low_tpr_vals – The tpr vals (one per ROC point) representing lowest val in CI

• high_tpr_vals – The tpr vals (one per ROC point) representing the highest val in CI

• low_fpr_vals – The fpr vals (one per ROC point) representing lowest val in CI

• high_fpr_vals – The fpr vals (one per ROC point) representing the highest val in CI

evalutils.roc.get_bootstrapped_roc_ci_curves(y_pred, y_true, num_bootstraps=100, ci_to_use=0.95)[source]

Produces confidence-interval curves to go alongside a regular ROC curve. This is done using bootstrapping: len(y_pred) samples are selected randomly (with replacement) from y_pred and y_true, and this is repeated num_bootstraps times.

Parameters
• y_pred (ndarray) – The predictions (scores) produced by the system being evaluated

• y_true (ndarray) – The true labels (1 or 0) which are the reference standard being used

• num_bootstraps (int) – How many times to make a random sample with replacement

• ci_to_use (float) – Which confidence interval is required.

Return type

BootstrappedROCCICurves

Returns

• fpr_vals – An equally spaced set of fpr vals between 0 and 1

• mean_tpr_vals – The mean tpr vals (one per fpr_val) obtained by bootstrapping

• low_tpr_vals – The tpr vals (one per fpr_val) representing lower curve for CI

• high_tpr_vals – The tpr vals (one per fpr_val) representing the upper curve for CI

• low_az_val – The lower Az (AUC) val for the given ci_to_use

• high_az_val – The higher Az (AUC) val for the given ci_to_use
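The resampling scheme described for this function can be sketched for the Az (AUC) bounds alone, using only the standard library. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, the AUC uses the rank (Mann-Whitney) formulation, resamples containing a single class are redrawn, and the bounds are read from the sorted bootstrap values by simple index truncation.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Fraction of positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, num_bootstraps=100, ci_to_use=0.95, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(scores)
    values = []
    while len(values) < num_bootstraps:
        # Draw n indices with replacement; scores and labels stay paired.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # AUC is undefined without both classes, so redraw
        values.append(auc([scores[i] for i in idx], sample_labels))
    values.sort()
    tail = (1.0 - ci_to_use) / 2.0
    low = values[int(tail * (num_bootstraps - 1))]
    high = values[int((1.0 - tail) * (num_bootstraps - 1))]
    return low, high


scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6]
labels = [0, 0, 1, 1, 1, 0, 1, 0]
low, high = bootstrap_auc_ci(scores, labels)
print(low, high)
```

Raising num_bootstraps makes the interval estimate more stable at the cost of more resampling work.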