Evaluating binary classifications is a pivotal task in statistics and machine learning, because it can influence decisions in many areas, for example the prognosis or therapy of patients in critical conditions. The scientific community has not yet agreed on a general-purpose statistical indicator for evaluating two-class confusion matrices (which contain true positives, true negatives, false positives, and false negatives), even though the advantages of the Matthews correlation coefficient (MCC) over accuracy and F1 score have already been shown. In this manuscript, we reaffirm that MCC is a robust metric that summarizes classifier performance in a single value, provided that positive and negative cases are of equal importance. We compare MCC to other metrics that value positive and negative cases equally: balanced accuracy (BA), bookmaker informedness (BM), and markedness (MK). We explain the mathematical relationships between MCC and these indicators, then show some use cases and a bioinformatics scenario where these metrics disagree and where MCC generates a more informative response. Additionally, we describe three exceptions where BM can be more appropriate: analyzing classifications where dataset prevalence is unrepresentative, comparing classifiers on different datasets, and assessing the random guessing level of a classifier. Except in these cases, we believe that MCC is the most informative among the single metrics discussed, and we suggest it as a standard measure for scientists of all fields. A Matthews correlation coefficient close to +1, in fact, means having high values for all the other confusion matrix metrics. The same cannot be said for balanced accuracy, markedness, bookmaker informedness, accuracy, and F1 score.

Evaluating the results of a binary classification remains an important challenge in machine learning and computational statistics. Every time researchers use an algorithm to discriminate the elements of a dataset having two conditions (for example, positive and negative), they can generate a contingency table called a two-class confusion matrix, which represents how many elements were correctly predicted and how many were wrongly classified.
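The four metrics named above can all be computed from the four counts of a two-class confusion matrix. The following is a minimal sketch using the standard textbook definitions. The function name `confusion_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Compute MCC, BA, BM and MK from two-class confusion matrix counts.

    Hypothetical helper for illustration only. Raises ValueError when a
    row or column of the confusion matrix sums to zero, because the
    metrics are undefined in that case.
    """
    actual_pos, actual_neg = tp + fn, tn + fp
    pred_pos, pred_neg = tp + fp, tn + fn
    if 0 in (actual_pos, actual_neg, pred_pos, pred_neg):
        raise ValueError("a row or column of the confusion matrix is empty")

    tpr = tp / actual_pos  # sensitivity (recall)
    tnr = tn / actual_neg  # specificity
    ppv = tp / pred_pos    # precision
    npv = tn / pred_neg    # negative predictive value

    mcc = (tp * tn - fp * fn) / math.sqrt(
        actual_pos * actual_neg * pred_pos * pred_neg
    )
    return {
        "MCC": mcc,              # Matthews correlation coefficient
        "BA": (tpr + tnr) / 2,   # balanced accuracy
        "BM": tpr + tnr - 1,     # bookmaker informedness
        "MK": ppv + npv - 1,     # markedness
    }


# Made-up, heavily imbalanced example: 95 positives, 5 negatives.
metrics = confusion_metrics(tp=90, tn=1, fp=4, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, the square of MCC equals the product of BM and MK, which is one way to see how the three indicators are related. With the imbalanced counts above, BA stays above 0.5 while MCC, BM, and MK all remain close to zero.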