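Confusion Matrix Evaluation Metrics | Machine Learning: Model Training and Evaluation (Evaluating Classification Results)

Time: 2021-02-18 23:51:36

Images: from the web | Text: 5号程序员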

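Once a classification model has been built, how do we judge whether it actually meets our requirements?

There are standard evaluation criteria for this.

To assess how well a model predicts on the test set, you can use the functions in the metrics module of the Sklearn library.

Commonly used evaluation methods are listed in the table below:

More methods can be found at: /sources/py/sklearn.metrics.html

To make these easier to understand, the methods are explained below through a worked example.

First, import the required libraries and the Iris dataset, then search for the best K-nearest-neighbors parameters: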

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns
# The two lines below are Jupyter/IPython magics; omit them in a plain script
%matplotlib inline
%config InlineBackend.figure_format = "retina"
from matplotlib.font_manager import FontProperties
# Raw string so the backslashes in the Windows path are not read as escapes
fonts = FontProperties(fname=r"C:\Windows\Fonts\SimHei.ttf", size=14)
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris data and hold out 25% of it as the test set
Iris = load_iris()
train_x, test_x, train_y, test_y = train_test_split(Iris.data, Iris.target, test_size=0.25, random_state=2)
# Pipeline: standardize the features, then classify with K nearest neighbors
pipe_KNN = Pipeline([("scale", StandardScaler()), ("KNN", KNeighborsClassifier())])
# Grid: n_neighbors from 1 to 9, with and without mean-centering
n_neighbors = np.arange(1, 10)
para_grid = [{"scale__with_mean": [True, False], "KNN__n_neighbors": n_neighbors}]
# 10-fold cross-validated grid search on the training set
gs_KNN_ir = GridSearchCV(estimator=pipe_KNN, param_grid=para_grid, cv=10, n_jobs=4)
gs_KNN_ir.fit(train_x, train_y)
gs_KNN_ir.best_params_

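This gives the result shown in the figure:

Confusion matrix

Next, we use confusion_matrix(true labels, predicted labels) to compute the model's confusion matrix and then visualize it. The model is the K-nearest-neighbors classifier with the best parameters found by the grid search.

For the visualization, sns.heatmap() is used to draw a heat map.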
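Before turning to the Iris results, the layout of a confusion matrix can be sketched on a small example. The labels below are hypothetical toy values chosen for illustration, not output from the K-nearest-neighbors model; the only library call used is metrics.confusion_matrix.

```python
from sklearn import metrics

# Hypothetical toy labels for three classes (0, 1, 2); not the Iris results.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 2, 0]

# Rows correspond to true classes, columns to predicted classes.
cm = metrics.confusion_matrix(y_true, y_pred)
print(cm)

# The diagonal holds the correctly classified samples,
# so overall accuracy is the trace divided by the total count.
accuracy = cm.trace() / cm.sum()
print(accuracy)
```

Entry (i, j) counts the samples whose true class is i and whose predicted class is j, so every off-diagonal entry is a specific kind of mistake.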

First, check how many samples of each class the test set contains:

pd.Series(test_y).value_counts()

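Then output the confusion matrix:

# Use the best pipeline found by the grid search to predict the test set
Iris_clf = gs_KNN_ir.best_estimator_
prey = Iris_clf.predict(test_x)
metrics.confusion_matrix(test_y, prey)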

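Finally, visualize the confusion matrix:

confm = metrics.confusion_matrix(test_y, prey)
# Plot the transpose so that the x axis is the true label and the y axis the predicted label
heatmap = sns.heatmap(confm.T, square=True, annot=True, fmt="d", cbar=False, cmap=plt.cm.gray_r)
# Workaround for matplotlib 3.1.1, which cut off the top and bottom rows of heat maps;
# on other versions these two lines can be dropped
bottom, top = heatmap.get_ylim()
heatmap.set_ylim(bottom + 0.5, top - 0.5)
plt.xlabel("True label")
plt.ylabel("Predicted label")
plt.show()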

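As the result shows, the heat map is easier to read than the raw confusion-matrix output and gives stronger visual contrast.

F1 score

classification_report() produces a report containing the results of the main evaluation metrics.

The following evaluates the K-nearest-neighbors predictions: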

print(metrics.classification_report(test_y,prey))

The meaning of each item in the output above is given in the table below:
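As a rough cross-check of the per-class columns in such a report, precision, recall, F1 and support can be derived by hand from a confusion matrix. This is a minimal sketch on hypothetical toy labels (not the Iris predictions), and it assumes every class is predicted at least once so that no division by zero occurs.

```python
import numpy as np
from sklearn import metrics

# Hypothetical toy labels for three classes; not the Iris results.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 2, 0]

cm = metrics.confusion_matrix(y_true, y_pred)  # rows: true, columns: predicted
tp = np.diag(cm)                               # correct predictions per class

# precision: of the samples predicted as class k, the fraction that truly are k
precision = tp / cm.sum(axis=0)
# recall: of the samples that truly are class k, the fraction predicted as k
recall = tp / cm.sum(axis=1)
# F1: harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
# support: number of true samples in each class
support = cm.sum(axis=1)

for k in range(len(tp)):
    print(k, precision[k], recall[k], f1[k], support[k])
```

The same per-class numbers can be obtained from metrics.precision_recall_fscore_support(y_true, y_pred), which is a convenient way to confirm the hand calculation.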

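AUC and ROC curves

Many learners produce a real-valued or probabilistic prediction for each test sample and then compare that prediction with a classification threshold: if it exceeds the threshold the sample is assigned to the positive class, otherwise to the negative class.

For example, a naive Bayes classifier predicts a probability in [0,1] for each test sample and compares it with 0.5: if the probability is greater than 0.5 the sample is judged positive, otherwise negative.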
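The thresholding step described above can be sketched in a few lines. The probabilities are hypothetical stand-ins for what a probabilistic classifier might output for the positive class, and classify is an illustrative helper name, not a library function.

```python
import numpy as np

# Hypothetical positive-class probabilities for six test samples.
proba = np.array([0.05, 0.30, 0.50, 0.51, 0.80, 0.95])

def classify(proba, threshold=0.5):
    """Label a sample 1 (positive) if its probability exceeds the threshold, else 0."""
    return (proba > threshold).astype(int)

default_labels = classify(proba)        # threshold 0.5
lenient_labels = classify(proba, 0.25)  # a lower threshold yields more positives

print(default_labels)
print(lenient_labels)
```

Lowering the threshold turns more samples into positives, which is exactly the trade-off that a ROC curve traces out across all thresholds.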

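How good these predicted values are, and where the threshold is placed, directly affects the classification result and therefore reflects the generalization ability of the learning algorithm.

Based on the predicted probabilities, the receiver operating characteristic (ROC) curve can be used to analyze the generalization performance of a machine learning algorithm.

In a ROC curve, the vertical axis is the true positive rate and the horizontal axis is the false positive rate.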
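To make the two axes concrete, the sketch below computes the true positive rate and false positive rate at a single threshold by hand, then uses metrics.roc_curve and metrics.auc to obtain the full curve and the area under it. The labels and scores are hypothetical toy values for a binary problem, and rates_at is an illustrative helper, not part of Sklearn.

```python
import numpy as np
from sklearn import metrics

# Hypothetical binary labels and predicted scores for four samples.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

def rates_at(threshold):
    """Return (TPR, FPR) when a score >= threshold counts as positive."""
    pred = y_score >= threshold
    tp = np.sum(pred & (y_true == 1))   # positives correctly flagged
    fp = np.sum(pred & (y_true == 0))   # negatives wrongly flagged
    tpr = tp / np.sum(y_true == 1)      # TPR = TP / (TP + FN)
    fpr = fp / np.sum(y_true == 0)      # FPR = FP / (FP + TN)
    return tpr, fpr

print(rates_at(0.5))

# Sweep over all thresholds to get the ROC curve, then integrate for the AUC.
fpr, tpr, thresholds = metrics.roc_curve(y_true, y_score)
roc_auc = metrics.auc(fpr, tpr)
print(roc_auc)
```

Plotting tpr against fpr (for example with plt.plot(fpr, tpr)) gives the ROC curve itself; an AUC closer to 1 indicates that positives tend to receive higher scores than negatives.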