Using Bi-LSTM with clstm model #12

Open
saja1994 opened this issue Jan 10, 2019 · 0 comments

Thank you for your effort. I want to use a Bi-LSTM with the clstm model, but when I do, the following error is raised:
`Traceback (most recent call last):
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1292, in _do_call
return fn(*args)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 219, in <module>
run_step(train_input, is_training=True)
File "train.py", line 198, in run_step
vars = sess.run(fetches, feed_dict)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 887, in run
run_metadata_ptr)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1286, in _do_run
run_metadata)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\client\session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]

Caused by op 'bidirectional_rnn/bw/ReverseSequence', defined at:
File "train.py", line 138, in <module>
classifier = clstm_clf(FLAGS)
File "C:\Users\Saja\Desktop\TextClassification-master\TextClassification-master\clstm_classifier.py", line 133, in __init__
sequence_length=self.sequence_length)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 466, in bidirectional_dynamic_rnn
inputs_reverse = nest.map_structure(_map_reverse, inputs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in map_structure
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\nest.py", line 347, in <listcomp>
structure[0], [func(*x) for x in entries])
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 464, in _map_reverse
batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\rnn.py", line 453, in _reverse
seq_axis=seq_axis, batch_axis=batch_axis)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2645, in reverse_sequence
name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 7984, in reverse_sequence
seq_dim=seq_dim, batch_dim=batch_dim, name=name)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 3272, in create_op
op_def=op_def)
File "M:\Anaconda\envs\py3\lib\site-packages\tensorflow\python\framework\ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): seq_lens(24) > input.dims(1)
[[{{node bidirectional_rnn/bw/ReverseSequence}} = ReverseSequence[T=DT_FLOAT, Tlen=DT_INT32, batch_dim=0, seq_dim=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat, _arg_sequence_length_0_4)]]`

I took the Bi-LSTM implementation from your rnn_classifier model:

`fw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
bw_cell = tf.contrib.rnn.LSTMCell(self.hidden_size)
# Add dropout to LSTM cell
fw_cell = tf.contrib.rnn.DropoutWrapper(fw_cell, output_keep_prob=self.keep_prob)
bw_cell = tf.contrib.rnn.DropoutWrapper(bw_cell, output_keep_prob=self.keep_prob)
# Stacked LSTMs
fw_cell = tf.contrib.rnn.MultiRNNCell([fw_cell]*self.num_layers, state_is_tuple=True)
bw_cell = tf.contrib.rnn.MultiRNNCell([bw_cell]*self.num_layers, state_is_tuple=True)

self._initial_state_fw = fw_cell.zero_state(self.batch_size, dtype=tf.float32)
self._initial_state_bw = bw_cell.zero_state(self.batch_size, dtype=tf.float32)
with tf.name_scope('dynamic_rnn'):

    outputs, state, _ = tf.nn.static_bidirectional_rnn(
        fw_cell,
        bw_cell,
        tf.unstack(tf.transpose(rnn_inputs, perm=[1, 0, 2])),
        initial_state_fw=self._initial_state_fw,
        initial_state_bw=self._initial_state_bw,
        sequence_length=self.sequence_length,
        #dtype=tf.float32,
        scope='BiLSTM'
        )
    #outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])
    self.outputs = outputs

out, state = tf.nn.bidirectional_dynamic_rnn(fw_cell,
                                             bw_cell,
                                             inputs=rnn_inputs,
                                             initial_state_fw=self._initial_state_fw,
                                             initial_state_bw=self._initial_state_bw,
                                             sequence_length=self.sequence_length)

state_fw = state[0]
state_bw = state[1]
output = tf.concat([state_fw[self.num_layers - 1].h, state_bw[self.num_layers - 1].h], 1)

self.final_state=output

# Softmax output layer
with tf.name_scope('softmax'):

    softmax_w = tf.get_variable('softmax_w', shape=[2 * self.hidden_size, self.num_classes], dtype=tf.float32)
    softmax_b = tf.get_variable('softmax_b', shape=[self.num_classes], dtype=tf.float32)

    # L2 regularization for output layer
    self.l2_loss += tf.nn.l2_loss(softmax_w)
    self.l2_loss += tf.nn.l2_loss(softmax_b)

    # logits
    self.logits = tf.matmul(self.final_state, softmax_w) + softmax_b
    predictions = tf.nn.softmax(self.logits)
    self.predictions = tf.argmax(predictions, 1, name='predictions')`