Documentation automation rate analysis #2
I might be able to shed a little light on this. The plot illustrates how many samples can be "automated" at what performance. Automated means we have a certain threshold at which we consider the model output confident enough to make a decision (i.e. model output > threshold). So let's say you want to reach 99% accuracy: the automation rate describes for how many samples the model is confident enough to achieve that performance goal (based on some sort of validation/test set). For the remaining samples you cannot guarantee this performance and therefore cannot make a decision. This assumes that the model output (e.g. softmax) can be interpreted as a confidence score of the network.
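In code, the idea looks roughly like this (an illustrative numpy sketch with mock data, not the actual metriculous implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Mock validation set: the model's confidence (e.g. max softmax) for each sample,
# and whether the corresponding prediction was correct.
confidence = rng.uniform(0.3, 1.0, size=1000)
correct = rng.uniform(size=1000) < confidence  # more confident -> more often correct


def automation_rate_at(threshold: float):
    """Share of samples the model is allowed to decide on, and the accuracy on that subset."""
    automated = confidence >= threshold
    rate = automated.mean()
    accuracy = correct[automated].mean() if automated.any() else float("nan")
    return rate, accuracy


# Sweep thresholds: a higher threshold gives higher accuracy but automates fewer samples.
for threshold in np.linspace(0.5, 0.95, 10):
    rate, accuracy = automation_rate_at(threshold)
    print(f"threshold={threshold:.2f}  automation_rate={rate:.2f}  accuracy={accuracy:.3f}")
```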
Hi, and sorry for the late reply. Thanks for helping out @magnusja, that's a great explanation. I'm considering removing the automation rate plot from the plots that are displayed by default. It's fairly non-standard and not as self-explanatory as I would like it to be. For example, an …
Hi @marlonjan, this include_experimental_features=False doesn't work with compare_classifiers. Is there another parameter I can use to remove the automation rate analysis?
Hi @jmrichardson, Thanks for trying out the package and asking the question! The `filter_figures` parameter is what you're looking for: it lets you remove individual figures by their title. Here is an example (with the interesting bit at the very end):

```python
import metriculous
import numpy as np


def normalize(array2d: np.ndarray) -> np.ndarray:
    return array2d / array2d.sum(axis=1, keepdims=True)


class_names = ["Cat", "Dog", "Pig"]
num_classes = len(class_names)
num_samples = 500

# Mock ground truth
ground_truth = np.random.choice(range(num_classes), size=num_samples, p=[0.5, 0.4, 0.1])

# Mock model predictions
perfect_model = np.eye(num_classes)[ground_truth]
noisy_model = normalize(
    perfect_model + 2 * np.random.random((num_samples, num_classes))
)
random_model = normalize(np.random.random((num_samples, num_classes)))

metriculous.compare_classifiers(
    ground_truth=ground_truth,
    model_predictions=[perfect_model, noisy_model, random_model],
    model_names=["Perfect Model", "Noisy Model", "Random Model"],
    class_names=class_names,
    one_vs_all_figures=True,
    # Filter out arbitrary figures by their name:
    filter_figures=lambda figure_title: "Automation Rate" not in figure_title,  # <--- YOUR FILTER
    # Sidenote: You can do the same for the metrics displayed in the table:
    filter_quantities=lambda name: "Accuracy" not in name,
).display()
```

As you've probably noticed, the documentation isn't great, so let me also share the list of parameters of the `compare_classifiers` function:

```python
def compare_classifiers(
    ground_truth: ClassificationGroundTruth,
    model_predictions: Sequence[ClassificationPrediction],
    model_names: Optional[Sequence[str]] = None,
    sample_weights: Optional[Sequence[float]] = None,
    class_names: Optional[Sequence[str]] = None,
    one_vs_all_quantities: bool = True,
    one_vs_all_figures: bool = False,
    top_n_accuracies: Sequence[int] = (),
    filter_quantities: Optional[Callable[[str], bool]] = None,
    filter_figures: Optional[Callable[[str], bool]] = None,
    primary_metric: Optional[str] = None,
    simulated_class_distribution: Optional[Sequence[float]] = None,
    class_label_rotation_x: str = "horizontal",
    class_label_rotation_y: str = "vertical",
) -> Comparison:
    return compare(
        evaluator=ClassificationEvaluator(
            ...
        ),
        ...
    )
```

Let me know if you have any other questions.
Hi @marlonjan, Works perfectly, thanks so much for the fast reply! The metriculous HTML output also works nicely with datapane as a dp.HTML(html) block. Thanks again for making this available :)
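For anyone else who wants to do the same, this is roughly what the combination looks like (treat it as a sketch: the exact datapane calls depend on the datapane version, and it reuses the mock data and models from the example above):

```python
import datapane as dp
import metriculous

# Reuses ground_truth, perfect_model, noisy_model, random_model and class_names
# from the compare_classifiers example above.
comparison = metriculous.compare_classifiers(
    ground_truth=ground_truth,
    model_predictions=[perfect_model, noisy_model, random_model],
    model_names=["Perfect Model", "Noisy Model", "Random Model"],
    class_names=class_names,
)

# Write the metriculous comparison to an HTML file, read it back,
# and embed it in a datapane report as an HTML block.
comparison.save_html("comparison.html")
with open("comparison.html") as f:
    html = f.read()

dp.Report(dp.HTML(html)).save(path="report.html")
```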
Could you provide some documentation, or at least a link, explaining the automation rate analysis plot in the classification reports?
Thanks!