What kind of input does sklearn's SVC score method require?

 Posted by 手机用户2502873393 on 2023-01-06 21:31

I am trying to build a classifier and score its performance. Here is my code:

def svc(train_data, train_labels, test_data, test_labels):
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score
    # Fit a linear-kernel SVM on the training set.
    svc = SVC(kernel='linear')
    svc.fit(train_data, train_labels)
    predicted = svc.predict(test_data)
    actual = test_labels
    # score() computes mean accuracy on the given test data and labels.
    score = svc.score(test_data, test_labels)
    print('svc score')
    print(score)
    print('svc accuracy')
    # accuracy_score expects (y_true, y_pred) in that order.
    print(accuracy_score(actual, predicted))

Now, when I run the function svc(X, x, Y, y) with:

X.shape = (1000, 150)    
x.shape = (1000, )   
Y.shape = (200, 150)   
y.shape = (200, )

I get this error:

      6     predicted = svc.predict(test_classed_data)
      7     actual = test_classed_labels
----> 8     score = svc.score(test_classed_data, test_classed_labels)
      9     print ('svc score')
     10     print (score)

local/lib/python3.4/site-packages/sklearn/base.py in score(self, X, y, sample_weight)
    289         """
    290         from .metrics import accuracy_score
--> 291         return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
    292 
    293 

    124     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
    125                        "multilabel-sequences"]):
--> 126         raise ValueError("{0} is not supported".format(y_type))
    127 
    128     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

The thing is, my test_labels (that is, y) look like this:

[ 15.5  15.5  15.5  15.5  15.5  15.5  15.5  15.5  15.5  15.5  15.5  20.5
  20.5  20.5  20.5  20.5  20.5  20.5  20.5  20.5  20.5  20.5  25.5  25.5
  25.5  25.5  25.5  25.5  25.5  25.5  25.5  25.5  25.5  30.5  30.5  30.5
  30.5  30.5  30.5  30.5  30.5  30.5  30.5  30.5  35.5  35.5  35.5  35.5
  35.5  35.5  35.5  35.5  35.5  35.5  35.5... ]

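I am really confused about why SVC cannot recognize these as discrete labels, since every example I have seen uses data in a similar format and works fine. Please help.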

1 Answer
  • The y passed to both the fit and score functions should consist of integers or strings that represent class labels.
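One way to see why the float labels are rejected is to ask scikit-learn how it classifies the target array. The following is a minimal sketch, assuming `type_of_target` from `sklearn.utils.multiclass` is available in the installed version; the short `float_labels` array is a made-up stand-in for the labels in the question.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for the question's labels: floats with fractional parts.
float_labels = np.array([15.5, 15.5, 20.5, 20.5, 25.5, 25.5])

# Non-integer floats are treated as a regression-style target.
float_kind = type_of_target(float_labels)
print(float_kind)

# The same values as strings are treated as class labels.
str_labels = float_labels.astype(str)
str_kind = type_of_target(str_labels)
print(str_kind)
```

The first call reports a continuous target, which is the word that appears in the ValueError, while the string version is reported as a classification target.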

    For example, if you have two classes, "foo" and 1, you can train an SVM like this:

    >>> import numpy as np
    >>> from sklearn.svm import SVC
    >>> clf = SVC()
    >>> X = np.random.randn(10, 4)
    >>> y = ["foo"] * 5 + [1] * 5
    >>> clf.fit(X, y)
    SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
      kernel='rbf', max_iter=-1, probability=False, random_state=None,
      shrinking=True, tol=0.001, verbose=False)
    

    and then test its accuracy:

    >>> X_test = np.random.randn(6, 4)
    >>> y_test = ["foo", 1] * 3
    >>> clf.score(X_test, y_test)
    0.5
    

    Floating-point values are apparently still accepted by fit, but they should not be, because class labels should not be real-valued.
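Applied to the question, one possible fix is to map each distinct float value to an integer class code before calling fit and score. The sketch below uses random synthetic data in place of the asker's arrays (the names, sizes, and label values are assumptions for illustration) and `np.unique` to build the mapping; it is one way to do this, not the only one. More recent scikit-learn releases may already reject continuous labels at fit time, so encoding the labels first is the safer route either way.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Synthetic stand-ins for the question's data (values and shapes are assumed).
label_values = [15.5, 20.5, 25.5, 30.5, 35.5]
train_labels = np.repeat(label_values, 20)   # 100 float labels
test_labels = np.repeat(label_values, 4)     # 20 float labels
train_data = rng.randn(100, 150)
test_data = rng.randn(20, 150)

# Map each distinct float to an integer code 0..n_classes-1.
classes, train_codes = np.unique(train_labels, return_inverse=True)

# Encode the test labels with the same mapping. This assumes every test
# label also occurs in the training labels.
test_codes = np.searchsorted(classes, test_labels)

clf = SVC(kernel='linear')
clf.fit(train_data, train_codes)

# score() now receives integer class labels, so accuracy can be computed.
score = clf.score(test_data, test_codes)
print(score)

# Translate predicted codes back to the original float values if needed.
predicted_values = classes[clf.predict(test_data)]
```

Converting the labels with `astype(str)` would also give discrete classes; integer codes are used here because they make it easy to recover the original values through `classes`.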

    Answered 2023-01-06 21:39