

[Language Features] Passing multiple inputs to TimeDistributed in Keras

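   In a recent experiment I used a hierarchical attention mechanism. My code is based on textClassifier, which is implemented in Keras, and porting it directly to tf2.0 was straightforward.
   In that code, the step from sentence level to document level is implemented with Keras's built-in TimeDistributed. TimeDistributed automatically applies the same operation to every time_step, so that all time_steps run the same computation and share the same weights.
   The official documentation describes the call arguments as follows: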

  • inputs: Input tensor.
  • training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the wrapped layer (only if the layer supports this argument).
  • mask: Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked. This argument is passed to the wrapped layer (only if the layer supports this argument).
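       In practice, only two points matter:
  • The input must be (at least) 3D, for example the output of an LSTM that returns a sequence.
  • The output will be 3D.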
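The two points above, together with the weight sharing mentioned earlier, can be checked with a minimal sketch. It assumes TensorFlow 2.x is installed; the sizes (2 samples, 3 timesteps, 4 features, 5 units) are arbitrary values chosen for illustration and do not come from the original model.

```python
import numpy as np
import tensorflow as tf

# One Dense layer, wrapped so that it is applied to every timestep.
dense = tf.keras.layers.Dense(5)
time_distributed = tf.keras.layers.TimeDistributed(dense)

# 3D input: (samples, timesteps, features). The sizes are illustrative.
x = np.random.rand(2, 3, 4).astype("float32")
y = np.asarray(time_distributed(x))
print(y.shape)  # the output is 3D as well: (samples, timesteps, units)

# Applying the same Dense layer to each timestep by hand should give the
# same values, because the wrapper reuses a single set of weights.
per_step = np.stack([np.asarray(dense(x[:, t, :])) for t in range(x.shape[1])], axis=1)
same = np.allclose(y, per_step, atol=1e-5)
print(same)
```
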
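       If anything is still unclear, see "When and How to use TimeDistributedDense", which explains when TimeDistributed should be used.
       Introductions found online usually take Dense or LSTM as the example and aim to explain how TimeDistributed works. In my case, however, I needed to pass two inputs to the layer wrapped by TimeDistributed, and I kept getting the following error: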
Traceback (most recent call last):
  File "D:/05 USSA/new/yelp2013/tf2/多层/UPNNModel.py", line 191, in <module>
    document_embedding = tf.keras.layers.TimeDistributed(sentEncoder)([review_input, user_id_3d])
  File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 817, in __call__
    self._maybe_build(inputs)
  File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 2141, in _maybe_build
    self.build(input_shapes)
  File "F:\Python36\lib\site-packages\tensorflow_core\python\keras\layers\wrappers.py", line 195, in build
    input_shape = tensor_shape.TensorShape(input_shape).as_list()
  File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
  File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
    return Dimension(value)
  File "F:\Python36\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
    self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'TensorShape'
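
Reading the traceback from the bottom up, the failure happens when `TensorShape` is built from `input_shape` in the wrapper's `build`. With a list of two inputs, `input_shape` is presumably a list of shapes rather than a single shape, so each "dimension" is itself a `TensorShape`. The sketch below reproduces only that step in isolation. It is my reading of the traceback, not a statement about the library's internals, and the two shapes are made-up stand-ins for `review_input` and `user_id_3d`.

```python
import tensorflow as tf

# Hypothetical shapes standing in for the two inputs in the traceback.
shape_a = tf.TensorShape([None, 3, 4])
shape_b = tf.TensorShape([None, 3, 1])

# A TensorShape expects integer (or None) dimensions. Handing it a list of
# TensorShape objects, as happens when two inputs are passed, fails.
try:
    tf.TensorShape([shape_a, shape_b])
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)
```
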

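   I searched online and found that the problem seemed to lie in how inputs are passed to TimeDistributed. I then came across the Stack Overflow question "Keras TimeDistributed for multi-input case?". It points out that TimeDistributed accepts only a single input, so the workaround is to merge the inputs into one tensor and then split them apart again with a Lambda layer. I tried this approach, and it does work.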
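
A minimal sketch of the merge-then-split workaround, assuming TensorFlow 2.x. The names `review_input`, `user_id_3d` and `document_embedding` follow the traceback. The sizes, the packing of the user id into the last column, and the averaging "encoder" are placeholders I made up for illustration; they are not the actual attention model from my experiment.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sizes only (assumptions, not the real experiment settings).
MAX_SENTS, MAX_WORDS = 3, 4
VOCAB, N_USERS, DIM = 50, 10, 8

# Sentence-level encoder with ONE input: word ids followed by the user id.
packed_sentence = layers.Input(shape=(MAX_WORDS + 1,), dtype="int32")
# Split the packed tensor back into its two parts with Lambda layers.
word_ids = layers.Lambda(lambda t: t[:, :MAX_WORDS])(packed_sentence)
user_id = layers.Lambda(lambda t: t[:, MAX_WORDS:])(packed_sentence)
word_vecs = layers.Embedding(VOCAB, DIM)(word_ids)    # (batch, MAX_WORDS, DIM)
user_vec = layers.Embedding(N_USERS, DIM)(user_id)    # (batch, 1, DIM)
# Placeholder for the attention step: average word and user vectors.
merged_vecs = layers.Concatenate(axis=1)([word_vecs, user_vec])
sentence_vec = layers.GlobalAveragePooling1D()(merged_vecs)
sentEncoder = tf.keras.Model(packed_sentence, sentence_vec)

# Document level: merge the two inputs first, then wrap the encoder.
review_input = layers.Input(shape=(MAX_SENTS, MAX_WORDS), dtype="int32")
user_id_3d = layers.Input(shape=(MAX_SENTS, 1), dtype="int32")
packed = layers.Concatenate(axis=-1)([review_input, user_id_3d])
document_embedding = layers.TimeDistributed(sentEncoder)(packed)
model = tf.keras.Model([review_input, user_id_3d], document_embedding)

# Random dummy data, just to confirm the shapes line up.
reviews = np.random.randint(0, VOCAB, size=(2, MAX_SENTS, MAX_WORDS)).astype("int32")
users = np.random.randint(0, N_USERS, size=(2, MAX_SENTS, 1)).astype("int32")
out = model.predict([reviews, users], verbose=0)
print(out.shape)  # one DIM-sized vector per sentence
```

Both inputs have to share a dtype for the concatenation to work, which is why both are declared as `int32` here. If the second input were a float feature rather than an id, one of them would need to be cast before merging and cast back after the split.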