Error when predicting on test data with a trained sklearn svm.SVC model
2022/04/11

  • Code:
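from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier

tfidfTrain = TfidfVectorizer(stop_words=stopWordList,
                             min_df=0.02
                             ).fit_transform(dataTrain_list)

tfidfTest = TfidfVectorizer(stop_words=stopWordList,
                            min_df=0.02,
                            max_features=605  # same error without this argument
                            # I suspected the error came from the training and test sets
                            # having different numbers of features, so I tried to force
                            # the test set to have as many features as the training set,
                            # but the error persisted
                            ).fit_transform(dataTest_list)

# Train the model on the training set
model = OneVsRestClassifier(svm.SVC(kernel='linear'))
clf = model.fit(tfidfTrain, labelTrain_list)

ytest_pred = clf.predict(tfidfTest)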
  • Error:
    ValueError: X has 597 features, but SVC is expecting 605 features as input.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-8d3a4697f978> in <module>()
     30 
     31 print("Evaluating the model on the test set:")
---> 32 ytest_pred = clf.predict(tfidfTest)
     33 print("classification_report: ")
     34 print(classification_report(labelTest_list, ytest_pred))

~\AppData\Roaming\Python\Python37\site-packages\sklearn\multiclass.py in predict(self, X)
    441             argmaxima = np.zeros(n_samples, dtype=int)
    442             for i, e in enumerate(self.estimators_):
--> 443                 pred = _predict_binary(e, X)
    444                 np.maximum(maxima, pred, out=maxima)
    445                 argmaxima[maxima == pred] = i

~\AppData\Roaming\Python\Python37\site-packages\sklearn\multiclass.py in _predict_binary(estimator, X)
     98         return estimator.predict(X)
     99     try:
--> 100         score = np.ravel(estimator.decision_function(X))
    101     except (AttributeError, NotImplementedError):
    102         # probabilities of the positive class

~\AppData\Roaming\Python\Python37\site-packages\sklearn\svm\_base.py in decision_function(self, X)
    754         transformation of ovo decision function.
    755         """
--> 756         dec = self._decision_function(X)
    757         if self.decision_function_shape == "ovr" and len(self.classes_) > 2:
    758             return _ovr_decision_function(dec < 0, -dec, len(self.classes_))

~\AppData\Roaming\Python\Python37\site-packages\sklearn\svm\_base.py in _decision_function(self, X)
    512         # NOTE: _validate_for_predict contains check for is_fitted
    513         # hence must be placed before any other attributes are used.
--> 514         X = self._validate_for_predict(X)
    515         X = self._compute_kernel(X)
    516 

~\AppData\Roaming\Python\Python37\site-packages\sklearn\svm\_base.py in _validate_for_predict(self, X)
    596                 order="C",
    597                 accept_large_sparse=False,
--> 598                 reset=False,
    599             )
    600 

~\AppData\Roaming\Python\Python37\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    583 
    584         if not no_val_X and check_params.get("ensure_2d", True):
--> 585             self._check_n_features(X, reset=reset)
    586 
    587         return out

~\AppData\Roaming\Python\Python37\site-packages\sklearn\base.py in _check_n_features(self, X, reset)
    399         if n_features != self.n_features_in_:
    400             raise ValueError(
--> 401                 f"X has {n_features} features, but {self.__class__.__name__} "
    402                 f"is expecting {self.n_features_in_} features as input."
    403             )

ValueError: X has 597 features, but SVC is expecting 605 features as input.

I haven't read the source code carefully yet; recording the error here for now.

2022/04/12:

  • Solution:

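Changing fit_transform() to transform() for the test data fixes it. The call has to be made on the same vectorizer object that was fitted on the training data, so that the test matrix has exactly the feature columns the model was trained on.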
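
The fix can be sketched on a toy corpus. This is only an illustration under assumptions: the documents, labels, and the variable names (train_docs, test_docs, vectorizer, and so on) are made up here and are not the original data, and stop_words / min_df are left at their defaults to keep the example small. It fits one TfidfVectorizer on the training texts, reuses it with transform() on the test texts, and, for comparison, shows that a second vectorizer fitted on the test texts ends up with a different number of columns.

```python
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier

# Toy data, invented purely for illustration (not the original dataset).
train_docs = [
    "the cat sat on the mat",
    "dogs chase cats in the yard",
    "stock prices rose sharply today",
    "the market fell after the report",
]
train_labels = ["pets", "pets", "finance", "finance"]
test_docs = ["a cat and a dog", "prices in the market"]

# Fit the vectorizer once, on the training data only.
vectorizer = TfidfVectorizer()
tfidf_train = vectorizer.fit_transform(train_docs)

# Reuse the fitted vocabulary for the test data: transform(), not fit_transform().
tfidf_test = vectorizer.transform(test_docs)

# For comparison: a separate vectorizer fitted on the test data learns its own
# vocabulary, so its column count generally differs from the training matrix.
tfidf_test_refit = TfidfVectorizer().fit_transform(test_docs)

print("train features:        ", tfidf_train.shape[1])
print("test (transform):      ", tfidf_test.shape[1])
print("test (separate fit):   ", tfidf_test_refit.shape[1])

# Train and predict as in the original code; the feature counts now match.
clf = OneVsRestClassifier(svm.SVC(kernel="linear")).fit(tfidf_train, train_labels)
pred = clf.predict(tfidf_test)
print(pred)
```

With this arrangement, max_features is not needed on the test side: the number of columns is determined entirely by the vocabulary learned from the training set.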

How I found the solution:
