I am using the TensorFlow DNNClassifier on a dataset with 25 features and 12 class labels.
My current code is:
import numpy as np
import tensorflow as tf

# TRAINING, TEST and PREDICT are file paths defined elsewhere (not shown).
def main():
    # Load the CSV files (no header row): float features, integer labels
    training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
        filename=TRAINING, target_dtype=np.int, features_dtype=np.float32)
    test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
        filename=TEST, target_dtype=np.int, features_dtype=np.float32)

    # One real-valued column holding all 25 features
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=25)]

    # 10 hidden layers of 50 units each, 12 output classes
    classifier = tf.contrib.learn.DNNClassifier(
        feature_columns=feature_columns,
        activation_fn=tf.nn.relu,
        hidden_units=[50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
        optimizer=tf.train.AdamOptimizer(0.01),
        n_classes=12,
        model_dir="./model/")

    def get_train_inputs():
        x = tf.constant(training_set.data)
        y = tf.constant(training_set.target)
        return x, y

    classifier.fit(input_fn=get_train_inputs, steps=1000)

    def get_test_inputs():
        x = tf.constant(test_set.data)
        y = tf.constant(test_set.target)
        return x, y

    accuracy_score = classifier.evaluate(input_fn=get_test_inputs, steps=1)["accuracy"]
    print("\n\nTest Accuracy: {0:f}\n".format(accuracy_score))

    def new_samples():
        # Parse the comma-separated prediction file into a float32 array
        with open(PREDICT, "r") as f:
            contents = f.readlines()
        arr = []
        for c in contents:
            arr.append([float(x.strip()) for x in c.split(",")])
        return np.array(arr, dtype=np.float32)

    predicted_classes = list(classifier.predict(input_fn=new_samples))
    print("Predicted class: {}\n".format(predicted_classes))
    probabilities = list(classifier.predict_proba(input_fn=new_samples))
    print("Predictions probability: ", probabilities)
    probabs = np.array(probabilities)
Accuracy on both the training data and the test data is below 0.2.
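To put that number in context, it can be compared against two trivial baselines: guessing uniformly among the 12 classes (1/12, about 0.083) and always predicting the most frequent class. The sketch below is only an illustration under stated assumptions: `baseline_accuracies` is a hypothetical helper, the toy labels stand in for `training_set.target`, and the range check assumes labels are meant to be integers from 0 to 11.

```python
from collections import Counter


def baseline_accuracies(targets, n_classes):
    """Return (uniform-guess accuracy, majority-class accuracy) for integer labels."""
    counts = Counter(int(t) for t in targets)
    # Assumption: labels are expected to be integers in [0, n_classes)
    out_of_range = sorted(c for c in counts if not 0 <= c < n_classes)
    if out_of_range:
        raise ValueError("labels outside [0, {}): {}".format(n_classes, out_of_range))
    uniform = 1.0 / n_classes
    majority = max(counts.values()) / float(len(targets))
    return uniform, majority


# Toy labels standing in for training_set.target (hypothetical data)
toy_targets = [0, 1, 1, 2, 1, 11, 3, 1]
uniform, majority = baseline_accuracies(toy_targets, n_classes=12)
print("uniform guess: {:.3f}, majority class: {:.3f}".format(uniform, majority))
```

If the measured accuracy is close to the majority-class figure, the model may simply be predicting one class for every sample.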
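I have tried more epochs and changed the learning rate, the activation function, and the optimizer, but the accuracy does not improve.
As far as I can tell, my network is severely underfitting. In that case adding more nodes and layers should help, but doing so barely changes the accuracy; it stays below 0.2.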
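One check that fits here, offered as an assumption rather than a diagnosis, is whether the 25 features are on comparable scales, since the code above feeds the raw CSV values straight into the network. A minimal sketch of standardizing with training-set statistics follows; `standardize` is a hypothetical helper, and the toy arrays stand in for `training_set.data` and `test_set.data`.

```python
import numpy as np


def standardize(train_x, other_x):
    """Scale both arrays using the mean and std of the training features only."""
    mean = train_x.mean(axis=0)
    std = train_x.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (train_x - mean) / std, (other_x - mean) / std


# Toy arrays standing in for training_set.data and test_set.data (hypothetical values)
toy_train = np.array([[1.0, 100.0], [3.0, 300.0], [5.0, 500.0]], dtype=np.float32)
toy_test = np.array([[3.0, 300.0]], dtype=np.float32)
train_scaled, test_scaled = standardize(toy_train, toy_test)
print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

The same training-set mean and standard deviation would also need to be applied to the samples read from `PREDICT`, so that all three inputs are scaled identically.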
Can anyone point out the mistake in my code, if there is one?