[Experiment Analysis] Evaluation Metrics

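My experiments are paused for a while, so I finally have time to stop and read some papers and code. I looked at evaluation metrics when I first joined the lab, but only skimmed them: I never wrote code for them or used them myself, so I soon forgot the details. Since then I have been relying on a senior labmate's code without fully understanding it, and when a metric came out poorly I could not explain why. So I am taking this chance to go through the metrics properly.
This post follows Zhang's paper A review on multi-label learning algorithms and Yang's paper Relevant Emotion Ranking from Text Constrained with Emotion Relationships.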

Ranking Loss

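For each sample, compare the score of every relevant label with the score of every irrelevant label, and count how often the relevant label scores lower than the irrelevant one. The count is normalized by the number of such pairs and averaged over all samples. Here $R_i$ is the set of relevant labels of sample $x_i$, $\overline{R_i}$ is the set of irrelevant labels, and $g_t(x_i)$ is the score of label $e_t$.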
$$rankingLoss=\frac{1}{n}\sum_{i=1}^{n}\sum_{(e_t,e_s)\in R_i\times \overline{R_i}}{\frac{\delta [g_t(x_i) < g_s(x_i)]}{|R_i|\times |\overline{R_i}|}}$$
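When a relevant label and an irrelevant label receive equal scores, Zhang counts the pair as 1 and Yang counts it as 0. The MATLAB code below contains a line that counts such a tie as 0.5, but that line is commented out, so as written a tie counts as 0.
MATLAB code: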

function retValue = rankingLoss(output,test_rank,test_length)
%rankingLoss returns the ranking loss for the output
%
%    Syntax
%
%       retValue=rankingLoss(output,test_rank,test_length)
%
%    Description
%
%       rankingLoss takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_rank        - An mxn array. test_rank(i,j) is the index of the label ranked j-th for the i-th test instance, so the first test_length(i) entries are its relevant labels.
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels.
%
%      and returns,
%           retValue         - A value representing the ranking loss

[m,n]=size(test_rank);

retValue=0;
for i=1:m
    tmpValue=0;
    irIndex=test_rank(i,test_length(i)+1:n);
    irOutput=output(i,irIndex);
    for j=1:test_length(i)
        tmpValue=tmpValue+sum(irOutput>output(i,test_rank(i,j)));
        % tmpValue=tmpValue+sum(irOutput==output(i,test_rank(i,j)))*1/2;
    end
    if test_length(i)~=0 && test_length(i)~=n
        tmpValue=tmpValue/test_length(i)/(n-test_length(i));
    end
    retValue=retValue+tmpValue;
end
retValue=retValue/m;
end
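
As a sanity check, the pairwise definition can be evaluated by hand on a toy input. The sketch below is not the function above. The scores and label sets are made-up values, and `tieWeight` is a hypothetical knob I introduce to switch between the three tie conventions (1 for Zhang, 0 for Yang, 0.5 for half credit).

```matlab
% Toy data (made up for illustration): 2 instances, 4 labels.
scores   = [0.9 0.2 0.6 0.1;
            0.3 0.8 0.4 0.7];
relevant = logical([1 0 1 0;
                    1 1 0 0]);
tieWeight = 0;   % hypothetical knob: 1 (Zhang), 0 (Yang), 0.5 (half credit)

m = size(scores,1);
rl = 0;
for i = 1:m
    r  = scores(i, relevant(i,:));    % scores of the relevant labels
    ir = scores(i, ~relevant(i,:));   % scores of the irrelevant labels
    [R, IR] = meshgrid(r, ir);        % every (relevant, irrelevant) pair
    wrong = sum(R(:) < IR(:)) + tieWeight*sum(R(:) == IR(:));
    rl = rl + wrong/numel(R);         % normalize by the number of pairs
end
rl = rl/m;
```

The first instance orders every pair correctly. In the second, the relevant label scored 0.3 falls below both irrelevant labels, so 2 of its 4 pairs are wrong and `rl` comes out as (0 + 0.5)/2 = 0.25. The toy data has no ties, so `tieWeight` does not change the result here.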

Pro Loss

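All label pairs are split into groups, and four of the groups are used: (relevant, relevant), (relevant, irrelevant), (relevant, threshold) and (threshold, irrelevant). A loss is computed for each group in the same way as ranking loss (a tie counts as 1/2), and the average over the four groups is the final loss.
$$proLoss=\frac{1}{n}\sum_{i=1}^{n}\sum_{e_t\in R_i \cup \lbrace \Theta \rbrace}{\sum_{e_s\in\prec(e_t)}{\frac{1}{norm_{t,s}}l_{t,s}}}$$
Here $\Theta$ is the threshold label, $\prec(e_t)$ is the set of labels that should rank below $e_t$, $l_{t,s}$ is a modified 0-1 loss, and $norm_{t,s}$ is the size of the group that the pair $(e_t,e_s)$ belongs to.
MATLAB code: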

function retValue = proLoss(output,test_rank,test_length)
%proLoss returns the PRO Loss for the output
%
%    Syntax
%
%       retValue=proLoss(output,test_rank,test_length)
%
%    Description
%
%       proLoss takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_rank        - An mxn array. test_rank(i,j) is the index of the label ranked j-th for the i-th test instance, so the first test_length(i) entries are its relevant labels.
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels.
%      and returns,
%           retValue         - A value representing the PRO Loss

[m,n]=size(test_rank);
loss=zeros(1,4);

%(relevant, relevant)
for i=1:m
    tmpValue=0;
    for j=1:test_length(i)
        rIndex=test_rank(i,j+1:test_length(i));
        rOutput=output(i,rIndex);
        tmpValue=tmpValue+sum(rOutput>output(i,test_rank(i,j)));
        tmpValue=tmpValue+sum(rOutput==output(i,test_rank(i,j)))/2;
    end
    if test_length(i)~=0 && test_length(i)~=1
        tmpValue=tmpValue*2/test_length(i)/(test_length(i)-1);
    end
    loss(1)=loss(1)+tmpValue;
end

%(relevant, irrelevant)
for i=1:m
    tmpValue=0;
    irIndex=test_rank(i,test_length(i)+1:n);
    irOutput=output(i,irIndex);
    for j=1:test_length(i)
        tmpValue=tmpValue+sum(irOutput>output(i,test_rank(i,j)));
        tmpValue=tmpValue+sum(irOutput==output(i,test_rank(i,j)))/2;
    end
    if test_length(i)~=0 && test_length(i)~=n
        tmpValue=tmpValue/test_length(i)/(n-test_length(i));
    end
    loss(2)=loss(2)+tmpValue;
end

%(relevant, threshold)
for i=1:m
    tmpValue=0;
    rIndex=test_rank(i,1:test_length(i));
    rOutput=output(i,rIndex);
    threshold=output(i,n+1);
    tmpValue=tmpValue+sum(rOutput<threshold);
    tmpValue=tmpValue+sum(rOutput==threshold)/2;
    if test_length(i)~=0
        tmpValue=tmpValue/test_length(i);
    end
    loss(3)=loss(3)+tmpValue;
end

%(threshold, irrelevant)
for i=1:m
    tmpValue=0;
    irIndex=test_rank(i,test_length(i)+1:n);
    irOutput=output(i,irIndex);
    threshold=output(i,n+1);
    tmpValue=tmpValue+sum(irOutput>threshold);
    tmpValue=tmpValue+sum(irOutput==threshold)/2;
    if test_length(i)~=n
        tmpValue=tmpValue/(n-test_length(i));
    end
    loss(4)=loss(4)+tmpValue;
end

retValue=mean(loss)/m;
end
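
To see how the four groups combine, here is a minimal sketch for a single instance. It is my own illustration, not the function above: the scores, the threshold and the ground-truth order of the relevant labels are assumed values, and `pairLoss` is a hypothetical helper for the modified 0-1 loss.

```matlab
% One made-up instance: 4 labels plus a threshold score.
scores     = [0.3 0.8 0.4 0.7];
threshold  = 0.5;
relevant   = [2 1];   % relevant labels, most relevant first (assumed ground truth)
irrelevant = [3 4];

% Modified 0-1 loss for "a should outrank b": 1 if misordered, 1/2 on a tie.
pairLoss = @(a,b) double(a < b) + 0.5*double(a == b);

% (relevant, relevant): only one pair here, label 2 should outrank label 1.
relRel = pairLoss(scores(relevant(1)), scores(relevant(2)));
% (relevant, irrelevant): all pairs, averaged.
[A, B] = meshgrid(scores(relevant), scores(irrelevant));
relIrr = mean(mean(pairLoss(A, B)));
% (relevant, threshold) and (threshold, irrelevant).
relThr = mean(pairLoss(scores(relevant), threshold));
thrIrr = mean(pairLoss(threshold, scores(irrelevant)));

proLossOne = mean([relRel relIrr relThr thrIrr]);
```

The group losses are 0, 0.5, 0.5 and 0.5, so `proLossOne` is 0.375. Compared with ranking loss, the two threshold groups also penalize a relevant label that falls below the threshold and an irrelevant label that rises above it.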

Hamming Loss

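The fraction of (sample, label) decisions that are wrong, where $\hat{R}_i$ is the predicted label set, $T$ is the number of labels and $\bigtriangleup$ is the symmetric difference of two sets.
$$hammingLoss=\frac{1}{nT}\sum_{i=1}^{n}|{\hat{R}}_i \bigtriangleup R_i|$$
MATLAB code: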

function retValue = hammingLoss(output,test_target,test_length)
%hammingLoss returns the Hamming Loss for the output
% 
%    Syntax
%
%       retValue=hammingLoss(output,test_target,test_length)
%
%    Description
%
%       hammingLoss takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels (not used here).
%           
%      and returns,
%           retValue         - A value representing the Hamming Loss
% Note: in the code below n is the column count of output, i.e. the number of labels plus the threshold column.
[m,n]=size(output);
threshold=output(:,n);
target=(output(:,1:n-1)>repmat(threshold,[1,n-1]));
notEqual=xor(target, test_target);
retValue=sum(sum(notEqual))/m/(n-1);
end
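
A small worked case makes the normalization by $nT$ concrete. The values below are made up, and the variable names are mine. A label is predicted when its score is strictly above the threshold column, as in the function above.

```matlab
% Made-up scores for 2 instances and 4 labels; the last column is the threshold score.
output = [0.9 0.2 0.6 0.1 0.5;
          0.3 0.8 0.4 0.7 0.5];
truth  = logical([1 0 1 0;
                  1 1 0 0]);

T    = size(truth,2);
pred = output(:,1:T) > repmat(output(:,T+1), [1,T]);   % predicted label sets
hl   = mean(xor(pred(:), truth(:)));                   % share of wrong decisions
```

The first instance is predicted exactly. The second misses label 1 and wrongly adds label 4, so 2 of the 8 decisions are wrong and `hl` is 0.25.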

Example F1

$$exampleF1=\frac{1}{n}\sum_{i=1}^{n}2\frac{|R_i \cap {\hat{R}}_i|}{|R_i|+| {\hat{R}}_i|}$$
MATLAB code:

function retValue = exampleF1(output,test_target,test_length)
%exampleF1 returns the example F1 for the output
%
%    Syntax
%
%       retValue = exampleF1(output,test_target,test_length)
%
%    Description
%
%       exampleF1 takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels.
%
%      and returns,
%           retValue         - A value representing the example F1

[m,n]=size(test_target);
threshold=output(:,n+1);
target=(output(:,1:n)>repmat(threshold,[1,n]));
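num=sum(target,2);
% Per-instance overlap divided by the two set sizes. Note that an instance with
% no relevant and no predicted labels gives 0/0 = NaN, which propagates to the result.
retValue=sum(target&test_target,2)./(test_length+num);
retValue=2*sum(retValue)/m;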
end
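
Because this metric averages one F1 value per sample, it helps to look at the per-sample values before the mean. The sketch below uses made-up predicted and true label sets, and the variable names are my own.

```matlab
% Made-up predicted and true label sets for 2 instances and 4 labels.
pred  = logical([1 0 1 0;
                 0 1 0 1]);
truth = logical([1 0 1 0;
                 1 1 0 0]);

% Per-instance F1: 2*|overlap| / (|predicted| + |true|).
f1   = 2*sum(pred & truth, 2) ./ (sum(pred,2) + sum(truth,2));
exF1 = mean(f1);

% Edge case: no true labels and no predicted labels gives 0/0.
f1Empty = 2*0 / (0 + 0);
```

Here `f1` is [1; 0.5] and `exF1` is 0.75. `f1Empty` is NaN, so a dataset that contains such instances needs an explicit convention for them before averaging.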

Micro F1

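Classes are not distinguished: TP, FP, TN and FN are summed over all classes, that is, over every sample-label decision, and F1 is computed once from the pooled counts.
$$microF1=F1(\sum_{t=1}^{T}{TP_t},\sum_{t=1}^{T}{FP_t},\sum_{t=1}^{T}{TN_t},\sum_{t=1}^{T}{FN_t})$$
MATLAB code: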

function retValue = microF1(output,test_target)
%microF1 returns the microF1 for the output
% 
%    Syntax
%
%       retValue=microF1(output,test_target)
%
%    Description
%
%       microF1 takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           
%      and returns,
%           retValue         - A value representing the microF1

[m,n]=size(test_target);
threshold=output(:,n+1);
target=(output(:,1:n)>repmat(threshold,[1,n]));

TP = sum(sum(target&test_target));
FP = sum(sum(target&(~test_target)));
FN = sum(sum((~target)&test_target));

beta=1;
if TP + FP + FN == 0
   retValue = 0;
else
   retValue = ((beta*beta + 1) * TP) / ((beta*beta + 1) * TP + beta*beta * FN + FP);
end
end

Macro F1

F1 is computed for each class separately, and the per-class values are then averaged.
$$macroF1=\frac{1}{T}\sum_{t=1}^{T}F1(TP_t,FP_t,TN_t,FN_t)$$
MATLAB code:

function retValue = macroF1(output,test_target)
%macroF1 returns the macroF1 for the output
% 
%    Syntax
%
%       retValue=macroF1(output,test_target)
%
%    Description
%
%       macroF1 takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           
%      and returns,
%           retValue         - A value representing the macroF1

[m,n]=size(test_target);
threshold=output(:,n+1);
target=(output(:,1:n)>repmat(threshold,[1,n]));

TP = sum(target&test_target,1);
FP = sum(target&(~test_target),1);
FN = sum((~target)&test_target,1);

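S=TP+FP+FN;

beta=1;
% Classes with TP+FP+FN==0 keep an F1 of 0. Both sides are indexed with the
% same mask so that the assignment sizes match.
valid=S>0;
retValue=zeros(1,n);
retValue(valid)=((beta*beta + 1) * TP(valid)) ./ ((beta*beta + 1) * TP(valid) + beta*beta * FN(valid) + FP(valid));
retValue=mean(retValue);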
end
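
The difference between the two averages shows up as soon as one class is predicted badly. This is a sketch on made-up label sets, with my own variable names. Classes that never occur and are never predicted are given an F1 of 0, which is the convention assumed here.

```matlab
% Made-up predicted and true label sets for 2 instances and 4 labels.
pred  = logical([1 0 1 0;
                 0 1 0 1]);
truth = logical([1 0 1 0;
                 1 1 0 0]);

% Per-class counts (one entry per label column).
TP = sum(pred & truth, 1);
FP = sum(pred & ~truth, 1);
FN = sum(~pred & truth, 1);

% Micro: pool the counts first, then compute a single F1.
micro = 2*sum(TP) / (2*sum(TP) + sum(FP) + sum(FN));

% Macro: one F1 per class, then average.
perLabel = zeros(1, numel(TP));
seen = (TP + FP + FN) > 0;
perLabel(seen) = 2*TP(seen) ./ (2*TP(seen) + FP(seen) + FN(seen));
macro = mean(perLabel);
```

The per-class F1 values are [2/3, 1, 1, 0], so `macro` is 2/3, while the pooled counts (TP = 3, FP = 1, FN = 1) give a `micro` of 0.75. The class with F1 = 0 pulls the macro average down more because every class carries equal weight.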

Subset Accuracy

The fraction of samples whose predicted label set is exactly the same as the true label set.
$$subsetAccuracy=\frac{1}{n}\sum_{i=1}^{n}{\delta[{\hat{R}}_i=R_i]}$$
MATLAB code:

function retValue = subsetAccuracy(output,test_target,test_length)
%SubsetAccuracy returns the Subset Accuracy for the output
%
%    Syntax
%
%       subAcc=subsetAccuracy(output,test_target,test_length)
%
%    Description
%
%       subsetAccuracy takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels.
%           
%      and returns,
%           retValue         - A value representing the Subset Accuracy

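% Here n is the column count of output: the number of labels plus the threshold column.
[m,n]=size(output);
threshold=output(:,n);
target=(output(:,1:n-1)>repmat(threshold,[1,n-1]));
% Exact match: the predicted set has as many labels as the true set,
% and all of the true labels are among them.
equal1=(sum(target,2)==test_length);
equal2=(sum(target&test_target,2)==test_length);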
retValue=sum(equal1&equal2)/m;
end
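
Subset accuracy is the strictest of the threshold-based metrics, which a two-sample example shows quickly. The label sets below are made up, and the one-line formulation is my own shorthand, not the function above.

```matlab
% Made-up predicted and true label sets for 2 instances and 4 labels.
pred  = logical([1 0 1 0;
                 0 1 0 1]);
truth = logical([1 0 1 0;
                 1 1 0 0]);

exact  = all(pred == truth, 2);   % 1 only when the whole row matches
subAcc = mean(exact);
```

The second instance gets half of its labels right but still counts as a complete miss, so `subAcc` is 0.5.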

One Error

The fraction of samples whose top-scoring label is not a relevant label.
$$oneError=\frac{1}{n}\sum_{i=1}^{n}{\delta[\arg\max_{e_t} g_t(x_i)\notin R_i]}$$
MATLAB code:

function retValue = oneError(output,test_target)
%oneError returns the one error for the output
%
%    Syntax
%
%       retValue = oneError(output,test_target)
%
%    Description
%
%       oneError takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           
%      and returns,
%           retValue         - A value representing the one error

[m,n]=size(test_target);
% Index of the top-scoring label of each instance (the threshold column is excluded).
[~,ind]=max(output(:,1:n),[],2);

retValue=0;
for i=1:m
    if test_target(i,ind(i))==0
        retValue=retValue+1;
    end
end
retValue=retValue/m;
end
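
One error only looks at the single top-scoring label, so it can be checked with a lookup instead of a loop. The following is a minimal sketch with made-up scores and my own variable names.

```matlab
% Made-up scores and true label sets for 2 instances and 4 labels.
scores = [0.9 0.2 0.6 0.1;
          0.3 0.5 0.4 0.7];
truth  = logical([1 0 1 0;
                  1 1 0 0]);

[~, top] = max(scores, [], 2);                           % top-scoring label per instance
idx      = sub2ind(size(truth), (1:size(scores,1))', top);
oneErr   = mean(~truth(idx));                            % share of irrelevant top labels
```

The top label of the first instance (label 1) is relevant, while the top label of the second (label 4) is not, so `oneErr` is 0.5.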

Average Precision

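For each relevant label, the fraction of labels ranked at or above it that are also relevant, averaged over the relevant labels and then over the samples. The label itself is counted, which is why the comparison is $\ge$.
$$averagePrecision=\frac{1}{n}\sum_{i=1}^{n}{\frac{1}{|R_i|}\sum_{t:e_t\in R_i}{\frac{|\lbrace e_s\in R_i|g_s(x_i)\ge g_t(x_i)\rbrace|}{|\lbrace e_s|g_s(x_i)\ge g_t(x_i)\rbrace|}}}$$
MATLAB code: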

function retValue = averagePrecision(output,test_target,test_length)
%averagePrecision returns the average precision for the output
%
%    Syntax
%
%       retValue = averagePrecision(output,test_target,test_length)
%
%    Description
%
%       averagePrecision takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           test_length      - An mx1 array, the i-th test instance has test_length(i,1) relevant labels.
%           
%      and returns,
%           retValue         - A value representing the average precision

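[m,n]=size(test_target);
% Sort labels by descending score; targetSort(i,:) holds the 0/1 relevance flags in ranked order.
[~,index]=sort(output(:,1:n)*-1,2);

targetSort=zeros(m,n);
for i=1:m
    targetSort(i,:)=test_target(i,index(i,:));
end
targetFlip=fliplr(targetSort);

% rank(i) is the position of the lowest-ranked relevant label of instance i.
[~,index]=max(targetFlip');
index=index';
rank=n-index+1;
% depth==0 means all relevant labels sit at the top of the ranking, so the precision is 1.
depth=rank-test_length;
index=1:m;
index=(index(depth~=0))';
retValue=m-length(index);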
for i=1:length(index)
   ind=index(i);
   count=0;
   dep=0;
   for j=1:rank(ind)
       if targetSort(ind,j)==1
           count=count+1;
           dep=dep+count/j;
       end
   end
   retValue=retValue+dep/test_length(ind);
end
retValue=retValue/m;
end
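
The inner loop above is easier to follow on a small ranking. This sketch is my own direct implementation of "precision at each relevant rank" on made-up data. It is not the function above, and it assumes every instance has at least one relevant label.

```matlab
% Made-up scores and true label sets for 2 instances and 4 labels.
scores = [0.9 0.2 0.6 0.1;
          0.3 0.8 0.4 0.7];
truth  = logical([1 0 1 0;
                  1 1 0 0]);

m  = size(scores,1);
ap = zeros(m,1);
for i = 1:m
    [~, order] = sort(scores(i,:), 'descend');
    hits = truth(i, order);                % relevance flags in ranked order
    pos  = find(hits);                     % ranks of the relevant labels
    ap(i) = mean((1:numel(pos)) ./ pos);   % precision at each relevant rank
end
avgPrec = mean(ap);
```

In the first instance the relevant labels occupy ranks 1 and 2, giving a precision of 1. In the second they sit at ranks 1 and 4, giving (1/1 + 2/4)/2 = 0.75, so `avgPrec` is 0.875.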

Coverage

The average depth one has to go down the ranked label list in order to cover all relevant labels of a sample.
$$coverage=\frac{1}{n}\sum_{i=1}^{n}\max_{t:e_t\in{R_i}}|\lbrace e_s|g_s(x_i)>g_t(x_i)\rbrace|$$
MATLAB code:

function retValue = coverage(output,test_target)
%coverage returns the coverage for the output
%
%    Syntax
%
%       retValue = coverage(output,test_target)
%
%    Description
%
%       coverage takes,
%           output           - An mx(n+1) array. The output of the ith testing instance on the jth class is stored in output(i,j). output(i,n+1) stores the output for the threshold label.
%           test_target      - An mxn array, test_target(i,j)=1 if the i-th instance belongs to the jth class, otherwise test_target(i,j)=0
%           
%      and returns,
%           retValue         - A value representing the coverage

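[m,n]=size(test_target);
% Sort labels by descending score and reorder the relevance flags accordingly.
[~,index]=sort(output(:,1:n)*-1,2);

targetSort=zeros(m,n);
for i=1:m
    targetSort(i,:)=test_target(i,index(i,:));
end
targetFlip=fliplr(targetSort);

% The first 1 in the flipped row marks the lowest-ranked relevant label;
% n-index is its rank minus 1, i.e. the number of labels ranked above it.
[~,index]=max(targetFlip');
index=index';
depth=n-index;
retValue=mean(depth);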
end
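
Coverage can be read straight off the ranked list as the rank of the last relevant label minus 1. The sketch below uses made-up data and my own variable names, and it assumes every instance has at least one relevant label.

```matlab
% Made-up scores and true label sets for 2 instances and 4 labels.
scores = [0.9 0.2 0.6 0.1;
          0.3 0.8 0.4 0.7];
truth  = logical([1 0 1 0;
                  1 1 0 0]);

m   = size(scores,1);
cov = zeros(m,1);
for i = 1:m
    [~, order] = sort(scores(i,:), 'descend');
    lastHit = find(truth(i, order), 1, 'last');   % rank of the lowest relevant label
    cov(i)  = lastHit - 1;                        % labels scored above it
end
coverageValue = mean(cov);
```

The last relevant label is at rank 2 in the first instance and at rank 4 in the second, so `cov` is [1; 3] and `coverageValue` is 2.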