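[Language Features] The TimeDistributed multi-input problem in Keras
Recently I used a hierarchical attention mechanism in my experiments. The code is based on the textClassifier code, which is implemented in Keras, and porting it directly to tf2.0 was easy.
In that code, the step from sentence level to document level is implemented with Keras's built-in TimeDistributed wrapper. TimeDistributed applies the same layer to every time_step of its input, so each time_step goes through the same computation with shared weights.
The official documentation describes the call arguments as follows: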
- inputs: Input tensor.
- training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the wrapped layer (only if the layer supports this argument).
- mask: Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked. This argument is passed to the wrapped layer (only if the layer supports this argument).
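In practice, only two points matter:
- The input must be (at least) 3D, for example the output of an LSTM that returns the full sequence.
- The output will be 3D as well, since the time_step dimension is kept.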
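To make these two points concrete, here is a minimal sketch that wraps a Dense layer in TimeDistributed. The sizes (`TIMESTEPS`, `FEATURES`, `UNITS`) and the batch size of 4 are arbitrary values chosen for illustration, not taken from the original experiment.

```python
import numpy as np
import tensorflow as tf

# Illustrative sizes only (assumptions, not from the original experiment).
TIMESTEPS, FEATURES, UNITS = 10, 16, 8

# 3D input: (batch, timesteps, features). The batch axis is implicit in Input.
inputs = tf.keras.Input(shape=(TIMESTEPS, FEATURES))

# The same Dense layer is applied to each of the 10 time steps.
td = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(UNITS))
outputs = td(inputs)
model = tf.keras.Model(inputs, outputs)

x = np.random.rand(4, TIMESTEPS, FEATURES).astype("float32")
y = model.predict(x, verbose=0)
print(y.shape)  # (4, 10, 8): still 3D, the time axis is preserved

# A single kernel is shared by all time steps; it has no time dimension.
kernel_shape = tuple(td.layer.kernel.shape)
print(kernel_shape)  # (16, 8)
```

The kernel shape is the quickest way to see the weight sharing: there is one `(FEATURES, UNITS)` matrix rather than one per time step.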
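If anything is still unclear, see "When and How to use TimeDistributedDense", which explains when TimeDistributed should be used.
Online introductions usually take Dense or LSTM as the example and aim to explain how TimeDistributed works. In my case, however, I needed to pass two inputs to the layer wrapped by TimeDistributed, and I kept getting the following error: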
Traceback (most recent call last):
File "D:/05 USSA/new/yelp2013/tf2/多层/UPNNModel.py", line 191, in <module>
document_embedding = tf.keras.layers.TimeDistributed(sentEncoder)([review_input, user_id_3d])
File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 817, in __call__
self._maybe_build(inputs)
File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 2141, in _maybe_build
self.build(input_shapes)
File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\layers\wrappers.py", line 195, in build
input_shape = tensor_shape.TensorShape(input_shape).as_list()
File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
self._dims = [as_dimension(d) for d in dims_iter]
File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
self._dims = [as_dimension(d) for d in dims_iter]
File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
return Dimension(value)
File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'TensorShape'
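After searching online, I found that the problem seemed to lie in how the inputs are passed to TimeDistributed. The traceback is consistent with this: build() in wrappers.py tries to convert the list of two input shapes into a single TensorShape, and the conversion fails on int(value). I then came across the Stack Overflow question "Keras TimeDistributed for multi-input case?". It points out that TimeDistributed accepts only a single input, so the solution is to merge the inputs into one tensor and then split them apart again with a Lambda layer. I tried this approach and it does work.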
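The merge-then-split workaround can be sketched as follows. This is not the original UPNNModel.py code: only the names `review_input`, `user_id_3d` and the idea of a sentence encoder come from the traceback, while the sizes, the averaging encoder, and the assumption that `user_id_3d` holds one integer user id per sentence are hypothetical choices made to keep the example small.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical sizes for illustration only.
MAX_SENTS, MAX_WORDS, VOCAB, N_USERS, DIM = 5, 7, 100, 10, 8

# Sentence-level encoder with ONE packed input: word ids followed by a user id.
packed_sent = layers.Input(shape=(MAX_WORDS + 1,), dtype="int32")
# Split the packed tensor back into its two parts with Lambda layers.
words = layers.Lambda(lambda t: t[:, :MAX_WORDS])(packed_sent)  # (batch, MAX_WORDS)
user = layers.Lambda(lambda t: t[:, MAX_WORDS:])(packed_sent)   # (batch, 1)

word_vec = layers.Embedding(VOCAB, DIM)(words)                  # (batch, MAX_WORDS, DIM)
user_vec = layers.Flatten()(layers.Embedding(N_USERS, DIM)(user))  # (batch, DIM)
# Stand-in for the real attention encoder: average the word vectors.
sent_vec = layers.GlobalAveragePooling1D()(word_vec)            # (batch, DIM)
sent_vec = layers.Concatenate()([sent_vec, user_vec])           # (batch, 2 * DIM)
sent_encoder = tf.keras.Model(packed_sent, sent_vec)

# Document level: merge the two inputs so TimeDistributed sees a single tensor.
review_input = layers.Input(shape=(MAX_SENTS, MAX_WORDS), dtype="int32")
user_id_3d = layers.Input(shape=(MAX_SENTS, 1), dtype="int32")
packed = layers.Concatenate(axis=-1)([review_input, user_id_3d])
document_embedding = layers.TimeDistributed(sent_encoder)(packed)
model = tf.keras.Model([review_input, user_id_3d], document_embedding)

x_words = np.random.randint(0, VOCAB, size=(2, MAX_SENTS, MAX_WORDS)).astype("int32")
x_user = np.random.randint(0, N_USERS, size=(2, MAX_SENTS, 1)).astype("int32")
out = model.predict([x_words, x_user], verbose=0)
print(out.shape)  # (2, 5, 16)
```

Concatenation only works directly here because both inputs are integer ids with the same leading dimensions. If the two inputs had different dtypes, one of them would have to be cast before merging and cast back after the split.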