I am currently trying to learn Sonnet.
My network (incomplete; the question is based on it):
import sonnet as snt
import tensorflow as tf

class Model(snt.AbstractModule):
    def __init__(self, name="LSTMNetwork"):
        super(Model, self).__init__(name=name)
        with self._enter_variable_scope():
            self.l1 = snt.LSTM(100)
            self.l2 = snt.LSTM(100)
            self.out = snt.LSTM(10)

    def _build(self, inputs):
        # 'inputs' is of shape (batch_size, input_length)
        # I need it to be of shape (batch_size, sequence_length, input_length)
        batch_size = tf.shape(inputs)[0]
        l1_state = self.l1.initial_state(batch_size)   # init with batch_size
        l2_state = self.l2.initial_state(batch_size)
        out_state = self.out.initial_state(batch_size)
        # one LSTM step per layer; there is no time dimension here
        l1_out, l1_state = self.l1(inputs, l1_state)
        l1_out = tf.tanh(l1_out)
        l2_out, l2_state = self.l2(l1_out, l2_state)
        l2_out = tf.tanh(l2_out)
        output, out_state = self.out(l2_out, out_state)
        output = tf.sigmoid(output)
        return output, out_state
In other frameworks (e.g. Keras), LSTM inputs have the shape (batch_size, sequence_length, input_length). However, the Sonnet documentation states that the input to Sonnet's LSTM has the shape (batch_size, input_length).
How do I use them for sequential input?
So far, I have tried using a for loop inside _build that iterates over each time step, but this gives seemingly random output.
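For reference, the loop I tried looked roughly like this (reconstructed from memory, so the details of how I sliced the time axis may differ slightly):

    def _build(self, inputs):
        # here 'inputs' is assumed to be (batch_size, sequence_length, input_length)
        batch_size = tf.shape(inputs)[0]
        l1_state = self.l1.initial_state(batch_size)
        l2_state = self.l2.initial_state(batch_size)
        out_state = self.out.initial_state(batch_size)
        output = None
        # iterate over the time axis, feeding one (batch_size, input_length)
        # slice per step and threading the states through
        for x_t in tf.unstack(inputs, axis=1):
            l1_out, l1_state = self.l1(x_t, l1_state)
            l1_out = tf.tanh(l1_out)
            l2_out, l2_state = self.l2(l1_out, l2_state)
            l2_out = tf.tanh(l2_out)
            output, out_state = self.out(l2_out, out_state)
        # keep only the final step's output
        return tf.sigmoid(output), out_state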
I have tried the same architecture in Keras, and it runs without any problems.
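For comparison, the Keras model I tested was roughly this (a sketch; putting the nonlinearities inside the LSTM cells via the activation argument is my approximation of the architecture above, not an exact match):

    import tensorflow as tf

    model = tf.keras.Sequential([
        # expects input of shape (batch_size, sequence_length, input_length)
        tf.keras.layers.LSTM(100, return_sequences=True, activation="tanh"),
        tf.keras.layers.LSTM(100, return_sequences=True, activation="tanh"),
        # return_sequences=False keeps only the last step: (batch_size, 10)
        tf.keras.layers.LSTM(10, activation="sigmoid"),
    ])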
I am executing in eager mode and using GradientTape for training.
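The training step looks roughly like this (a sketch; x, y, and the loss are placeholders for my actual data and objective, and I am assuming Sonnet 1's get_variables() for collecting the module's trainable variables):

    import tensorflow as tf

    tf.enable_eager_execution()

    model = Model()
    optimizer = tf.train.AdamOptimizer(1e-3)

    def train_step(x, y):
        with tf.GradientTape() as tape:
            output, _ = model(x)
            loss = tf.losses.mean_squared_error(y, output)  # placeholder loss
        variables = model.get_variables()  # the module's trainable variables
        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))
        return loss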