In the bcubed function, by my verification, the precision and recall variables may be swapped #43

FDuiv · 2021-02-04T10:23:46Z

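In the code, axis=0 of the confusion matrix returned by coo_matrix is the predicted-class axis, that is, each row corresponds to one predicted class.
Incorrect code:
precision = np.sum(cm_norm * (cm / cm.sum(axis=0)))
recall = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
Correct code:
recall = np.sum(cm_norm * (cm / cm.sum(axis=0)))
precision = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
The F1 score still happens to come out correct, because it is symmetric in precision and recall.
Fortunately, the paper does not report these numbers.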
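The point about F1 can be checked in a couple of lines. This is only a sketch: `f1_score` is a hypothetical helper written for illustration, not a function from the repository, and the two input values are arbitrary.

```python
import math


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Arbitrary illustrative values, not results from the repository.
p, r = 0.6, 0.7

# Swapping the two arguments leaves the harmonic mean unchanged,
# which is why mislabeled precision/recall still yield the same F1.
swap_is_harmless = math.isclose(f1_score(p, r), f1_score(r, p))
print(swap_is_harmless)
```

So a swap of the two variable names would only be visible if precision or recall is reported on its own.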

Here is the analysis. I ran an experiment using the example from the BCubed paper:

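import numpy as np
from scipy.sparse import coo_matrix

pred = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
label = np.array([0, 0, 0, 0, 0, 1, 1, 2, 1, 3, 4, 1, 1, 1])
# Rows index predicted clusters, columns index ground-truth labels.
cm = coo_matrix(
    (np.ones(14), (pred, label)),
    shape=(3, 5),
    dtype=int).toarray()  # np.int was removed in NumPy 1.24; builtin int is equivalent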
"""
[[4, 0, 0, 0, 0],
[1, 2, 0, 0, 0],
[0, 4, 1, 1, 1]]
"""
np.expand_dims(cm.sum(axis=1), 1)
"""
[[4],
[3],
[7]]
"""
cm / np.expand_dims(cm.sum(axis=1),1)
"""
[[1. , 0. , 0. , 0. , 0. ],
[0.33333333, 0.66666667, 0. , 0. , 0. ],
[0. , 0.57142857, 0.14285714, 0.14285714, 0.14285714]]
"""
cm * cm / np.expand_dims(cm.sum(axis=1),1)
"""
[[4. , 0. , 0. , 0. , 0. ],
[0.33333333, 1.33333333, 0. , 0. , 0. ],
[0. , 2.28571429, 0.14285714, 0.14285714, 0.14285714]]
"""

np.sum(cm * cm / np.expand_dims(cm.sum(axis=1),1))/cm.sum()

0.5986394557823128

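This is the same computation as in your code, but the variable name there is wrong:
cm_norm = cm / cm.sum()
recall = np.sum(cm_norm * (cm / np.expand_dims(cm.sum(axis=1), 1)))
This should be precision:
precision: (4*4/4 + 1*1/3 + 2*2/3 + 3*(1*1/7) + 4*4/7)/14 = 0.5986394557823128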
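As a cross-check that does not depend on any confusion-matrix axis convention, the same numbers can be recomputed item by item from the BCubed definition. This is a minimal sketch: for each item, precision is the fraction of items in its predicted cluster that share its label, and recall is the fraction of items with its label that share its predicted cluster. The variable names below are chosen for illustration.

```python
import numpy as np

pred = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
label = np.array([0, 0, 0, 0, 0, 1, 1, 2, 1, 3, 4, 1, 1, 1])

# Pairwise boolean matrices: entry (i, j) is True when items i and j
# share a predicted cluster / a ground-truth label.
same_cluster = pred[:, None] == pred[None, :]
same_class = label[:, None] == label[None, :]
both = same_cluster & same_class

# Per-item scores, then the average over all 14 items.
item_precision = both.sum(axis=1) / same_cluster.sum(axis=1)
item_recall = both.sum(axis=1) / same_class.sum(axis=1)

bcubed_precision = item_precision.mean()  # about 0.5986
bcubed_recall = item_recall.mean()        # about 0.6952
print(bcubed_precision, bcubed_recall)
```

The precision obtained this way matches the value computed from the row sums (axis=1) of the confusion matrix.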

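Clearly, the denominator should be the total number of items in each predicted class:
np.expand_dims(cm.sum(axis=1), 1)
"""
[[4],
[3],
[7]]
"""
So axis=1 is the axis to sum over when computing precision.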
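Putting the pieces together, a corrected version could look like the sketch below. The function name `bcubed_sketch` is hypothetical and this is not the repository's actual implementation. It assumes that rows of the confusion matrix are predicted clusters and columns are ground-truth labels, and that cluster and label ids are contiguous integers starting at 0 with every id present, so no row or column sum is zero.

```python
import numpy as np
from scipy.sparse import coo_matrix


def bcubed_sketch(pred, label):
    """Return (precision, recall, f1) for integer cluster/label arrays."""
    n_pred = int(pred.max()) + 1
    n_label = int(label.max()) + 1
    # Duplicate (pred, label) pairs are summed when converting to dense.
    cm = coo_matrix(
        (np.ones(len(pred)), (pred, label)),
        shape=(n_pred, n_label),
    ).toarray()
    cm_norm = cm / cm.sum()
    # Precision: divide by the size of each predicted cluster (row sums).
    precision = np.sum(cm_norm * (cm / cm.sum(axis=1, keepdims=True)))
    # Recall: divide by the size of each ground-truth class (column sums).
    recall = np.sum(cm_norm * (cm / cm.sum(axis=0, keepdims=True)))
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


pred = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
label = np.array([0, 0, 0, 0, 0, 1, 1, 2, 1, 3, 4, 1, 1, 1])
precision, recall, f1 = bcubed_sketch(pred, label)
print(precision, recall, f1)
```

Using `keepdims=True` is equivalent to the `np.expand_dims(..., 1)` call for the row sums and makes the two denominators read symmetrically.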
