Why Loss and Accuracy Metrics Conflict?

A loss function is used to optimize a machine learning algorithm. An accuracy metric is used to measure the algorithm’s performance (accuracy) in an interpretable way. It goes against my intuition that these two sometimes conflict: loss is getting better while accuracy is getting worse, or vice versa.

I’m working on a classification problem and once again got these conflicting results on the validation set.

Loss vs Accuracy

Accuracy (orange) finally rises to a bit over 90%, while the loss (blue) drops nicely until epoch 537 and then starts deteriorating. Around epoch 50 there’s a strange drop in accuracy even though the loss is smoothly and quickly getting better.

My loss function here is categorical cross-entropy that is used to predict class probabilities. The target values are one-hot encoded so the loss is the best when the model’s output is very close to 1 for the right category and very close to 0 for other categories. The loss is a continuous variable.

Accuracy or more precisely categorical accuracy gets a discrete true or false value for a particular sample.

It’s easy to construct a concrete test case showing conflicting values. Case 1 has less error but worse accuracy than case 2:

Loss vs Accuracy example spreadsheet

For reference, calculating categorical cross-entropy in Keras for one sample:

truth = K.variable([[1., 0., 0.]])
prediction = K.variable([[.50, .25, .25]])
loss = K.eval(K.categorical_crossentropy(truth, prediction))

In what kind of situations does loss-vs-accuracy discrepancy occur?

When the predictions get more confident, loss gets better even though accuracy stays the same. The model is thus more robust, as there’s a wider margin between classes.
If the model becomes over-confident in its predictions, a single false prediction will increase the loss unproportionally compared to the (minor) drop in accuracy. An over-confident model can have good accuracy but bad loss. I’d assume over-confidence equals over-fitting.
Imbalanced distributions: if 90% of the samples are “apples”, then the model would have good accuracy score if it simply predicts “apple” every time.

Accuracy metric is easier to interprete, at least for categorical training data. Accuracy however isn’t differentiable so it can’t be used for back-propagation by the learning algorithm. We need a differentiable loss function to act as a good proxy for accuracy.

14 Comments

Sandeep Hooda

January 14, 2019 at 4:42 pm

Wow no one explained it that great.
Jussi Huotari

January 15, 2019 at 5:43 pm

Thanks, Sandeep! 😊
rajubjc

April 16, 2019 at 4:02 pm

Beautiful explanation!
Maged Rawash

April 16, 2019 at 10:04 pm

thanks a lot it helps me understand the difference : )
Seyoung

June 11, 2019 at 3:39 pm

Kiitos paljon hienosta selityksestä!
joe

June 13, 2019 at 8:47 am

Thanks.
Can you link references and some resource from paper or book ?
Thanks
Shuvo

July 21, 2019 at 7:39 am

Neat !
John Doe

July 28, 2019 at 7:24 pm

As an ML novice, I was tinkering with a ConvNet in keras and I asked myself, “what would happen if I over-train this model?”, I knew it would overfit, but I wanted to see how that would look like. I expected to see loss go up, but I didn’t expect accuracy to stay the same. This article succinctly tells me why that is. Thanks for this!
Jussi Huotari

August 13, 2019 at 9:47 am

Sounds familiar :) Testing and tinkering is a great way to learn. Thanks for the comment.
Pablo Ariel Fahnle

August 16, 2019 at 9:20 pm

Amazing example, thanks!
Tethys

March 5, 2020 at 9:29 pm

So, what if both the training and validation loss decrease but accuracy stay around the same level (let’s say 60%). In such a case, what’s the reason causing it?

What will the result be if we continue to training more epochs? Will the loss stop decreasing at some point but the accuracy still stick around the same level?

Does this means the network is incapable of doing this task?
Alex

May 2, 2020 at 9:05 pm

Super thanks !
Kasjan

May 23, 2020 at 12:53 pm

Great post, understood it here first :)
thanks to the table, it is super clear!
Pingback: Loss vs Accuracy - 모두의연구소

Jussi Huotari's Web

Why Loss and Accuracy Metrics Conflict?

14 Comments

Leave a Reply