Let us assume a 2x2 image, 3 channels (i.e. 3 possible labels), and a batch of 2.
The predicted scores (logits) are then:
Assume each 2x2 image is flattened into a single row of 4 pixels.
For N = 1 (of batch 2):
[ 9 2 5 8 ]   Each row is a class
[ 7 8 4 6 ]   Each column holds the scores of every class for one pixel
[ 8 4 6 7 ]
For N = 2 (of batch 2):
[ 7 3 0 3 ]
[ 4 6 1 1 ]
[ 2 4 4 2 ]
If we convert these to softmax scores (this is exactly what Caffe's Softmax layer does):
say a column (one pixel's logits for the 3 classes) is:
[ x
y
z ]
Then the softmax scores (one per class/channel, normalized over the column, i.e. per pixel) are:
[ e^x / ( e^x + e^y + e^z )
e^y / ( e^x + e^y + e^z )
e^z / ( e^x + e^y + e^z ) ]
So the softmax output will be:
For N = 1 (of batch 2):
[ 0.6652409  0.00242826 0.24472848 0.6652409  ]
[ 0.09003057 0.9796292  0.09003057 0.09003057 ]
[ 0.24472846 0.01794253 0.66524094 0.24472846 ]
For N = 2 (of batch 2):
[ 0.94649917 0.04201007 0.01714783 0.6652409  ]
[ 0.04712342 0.8437947  0.04661262 0.09003057 ]
[ 0.00637746 0.1141952  0.93623954 0.24472846 ]
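A minimal NumPy sketch (outside Caffe) that reproduces these softmax numbers; it simply stacks the two 3x4 logit matrices above and reshapes them to N x C x H x W:

import numpy as np

# The two 3 x 4 logit matrices above, reshaped to N x C x H x W = 2 x 3 x 2 x 2
logits = np.array([[9, 2, 5, 8],
                   [7, 8, 4, 6],
                   [8, 4, 6, 7],
                   [7, 3, 0, 3],
                   [4, 6, 1, 1],
                   [2, 4, 4, 2]], dtype=np.float64).reshape(2, 3, 2, 2)

# Softmax over the class axis (axis=1), i.e. independently for every pixel
e = np.exp(logits - logits.max(axis=1, keepdims=True))  # subtract the max for numerical stability
softmax = e / e.sum(axis=1, keepdims=True)

print(softmax[0].reshape(3, 4))  # rows = classes, columns = pixels (N = 1)
print(softmax[1].reshape(3, 4))  # N = 2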
Cross Entropy:
Let's say the ground-truth labels are:
[ 2 1 2 0 ]   (N = 1)
[ 0 1 2 2 ]   (N = 2)
Cross-entropy loss = - sum ( gt * loge(pred) ), averaged over the pixels of each image:
Loss1 = - [ loge(0.24472846) + loge(0.9796292) + loge(0.66524094) + loge(0.6652409) ] / 4 = 2.2433988 / 4 = 0.5608497
Loss2 = - [ loge(0.94649917) + loge(0.8437947) + loge(0.93623954) + loge(0.24472846) ] / 4 = 1.6983210 / 4 = 0.4245803
Average loss over the batch = (0.5608497 + 0.4245803) / 2 = 0.9854300 / 2 = 0.4927150
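Continuing the NumPy sketch above, the per-image losses and the batch average can be checked like this (`softmax` is the array computed there; `labels` is the ground truth just listed):

# Ground-truth class index per pixel, N x (H*W), as listed above
labels = np.array([[2, 1, 2, 0],
                   [0, 1, 2, 2]])

sm = softmax.reshape(2, 3, 4)                       # N x C x (H*W)
picked = sm[np.arange(2)[:, None], labels,          # probability of the true class
            np.arange(4)[None, :]]                  # at every pixel
per_image_loss = -np.log(picked).mean(axis=1)
print(per_image_loss)         # ~[0.5608497, 0.4245803]  (Loss1, Loss2)
print(per_image_loss.mean())  # ~0.4927150, the batch average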
So for Caffe:
Pred = N x C x H x W (C = number of classes), i.e. one score per class per pixel, like a one-hot layout
Label = N x 1 x H x W (labels are class indices 0, 1, ..., C-1)
The output is the log loss averaged over all pixels across the batch.
import numpy as np
import caffe

# Logits, shape N x C x H x W = 2 x 3 x 2 x 2 (the same numbers as the matrices above)
ip = np.array([[[[9,2],[5,8]],[[7,8],[4,6]],[[8,4],[6,7]]],
               [[[7,3],[0,3]],[[4,6],[1,1]],[[2,4],[4,2]]]])
# Labels, shape N x 1 x H x W; fl is the ground truth used in the worked example,
# tl is an alternative set in which both images have the same labels
tl = np.array([[[[0,1],[2,0]]], [[[0,1],[2,0]]]])
fl = np.array([[[[2,1],[2,0]]], [[[0,1],[2,2]]]])

caffe.set_mode_cpu()
net = caffe.Net('softmax.prototxt', caffe.TEST)  # the prototxt shown below
net.blobs['pred'].data[...] = ip    # or ip[1][None, :] to feed a single image
net.blobs['label'].data[...] = fl   # or fl[1][None, :]
out = net.forward()
sm = net.blobs['softmax'].data
print(sm.shape)      # (2, 3, 2, 2)
print(out['loss2'])  # should be ~0.4927150, matching the hand computation above
Prototxt (saved as softmax.prototxt):
layer {
  name: "pred"
  type: "Input"
  top: "pred"
  input_param {
    shape {
      dim: 2  # N (batch)
      dim: 3  # C (classes)
      dim: 2  # H
      dim: 2  # W
    }
  }
}
layer {
  name: "label"
  type: "Input"
  top: "label"
  input_param {
    shape {
      dim: 2  # N (batch)
      dim: 1
      dim: 2  # H
      dim: 2  # W
    }
  }
}
layer {
  name: "softmax"
  type: "Softmax"
  bottom: "pred"
  top: "softmax"
}
layer {
  name: "loss2"
  type: "SoftmaxWithLoss"
  bottom: "pred"
  bottom: "label"
  top: "loss2"
  loss_weight: 1
}
# layer {
#   name: "loss1"
#   type: "MultinomialLogisticLoss"
#   bottom: "softmax"
#   bottom: "label"
#   top: "loss1"
#   loss_weight: 1
# }
If the batch size in the prototxt is 2 but you assign only one input sample (e.g. ip[1][None, :]), it is broadcast into both batch slots, so the output is duplicated.
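A quick NumPy-only sketch of why that happens: assigning a (1, C, H, W) array into an (N, C, H, W) buffer broadcasts it along the batch axis (the names here are only for illustration):

import numpy as np

blob = np.zeros((2, 3, 2, 2), dtype=np.float32)          # stand-in for net.blobs['pred'].data
single = np.arange(12, dtype=np.float32).reshape(1, 3, 2, 2)
blob[...] = single                                        # broadcast over the batch axis
print(np.array_equal(blob[0], blob[1]))                   # True: the one sample fills both slots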