pandas - The role of coefficients in MultinomialNB

Translated from: https://stackoverflow.com/questions/48350403 2018-01-19T22:23:42.167

Viewed 359 times

I instantiated a MultinomialNB classifier and trained it on vectorized fake/real news articles, and now I am trying to understand what the coefficients mean.

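from sklearn.naive_bayes import MultinomialNB

nb_classifier = MultinomialNB()
# (the classifier is then fit on the count-vectorized training data; that step is not shown here)

# Extract the class labels: ('Fake' or 'Real')
class_labels = nb_classifier.classes_

# Extract the feature names from the vectorizer I used
# (newer scikit-learn releases name this method get_feature_names_out())
feature_names = count_vectorizer.get_feature_names()

# Zip the feature names together with the coefficient array and sort by weight
feat_with_weights = sorted(zip(nb_classifier.coef_[0], feature_names))

# Print the 20 highest-weighted tokens (or use class_labels[1] = 'Real')
print(class_labels[0], feat_with_weights[-20:])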

The output is:

Fake [(-6.2632792078858461, 'sanders'), (-6.2426599206831099, 'house'), (-6.1832365002123097, 'senate'), ..., (..., 'republicans'), ...]

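I understand that a higher coefficient (-5.9) means the token is more predictive than one with -6.2. What I am not sure about is which class that relationship points to. Does it mean the token 'republicans' is strongly associated with fake news, or with real news?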
