Mismatching dimensions when predicting with multitask models #26

Closed
ivyleavedtoadflax opened this issue Mar 31, 2020 · 11 comments · Fixed by #25

Labels: bug (Something isn't working)

@ivyleavedtoadflax (Contributor) commented Mar 31, 2020

This issue occurs in #25. When running predict via the split_parse command, the following error results:

(virtualenv)  $ python -m deep_reference_parser split_parse "Upson MA (2019). This is a reference. In a journal. 16(1) 1-23"
Using TensorFlow backend.
ℹ Using config file:
/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/configs/2020.3.18_multitask.ini
ℹ Attempting to download model artefacts if they are not found locally
in models/multitask/2020.3.18_multitask/. This may take some time...
✔ Found models/multitask/2020.3.18_multitask/indices.pickle
✔ Found models/multitask/2020.3.18_multitask/weights.h5
✔ Found embeddings/2020.1.1-wellcome-embeddings-300.txt
Traceback (most recent call last):
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1607, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 347 and 324. Shapes are [347,100] and [324,100]. for 'Assign' (op: 'Assign') with input shapes: [347,100], [324,100].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/matthew/.pyenv/versions/3.7.2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/matthew/.pyenv/versions/3.7.2/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/__main__.py", line 30, in <module>
    plac.call(commands[command], sys.argv[1:])
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/split_parse.py", line 198, in split_parse
    out = mt.split_parse(text, return_tokens=tokens, verbose=True)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/split_parse.py", line 116, in split_parse
    preds = self.drp.predict(tokens, load_weights=True)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/deep_reference_parser.py", line 1026, in predict
    self.load_weights()
  File "/home/matthew/Documents/wellcome/deep_reference_parser/deep_reference_parser/deep_reference_parser.py", line 997, in load_weights
    self.model, self.weights_path, include_optimizer=False
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/keras_contrib/utils/save_load_utils.py", line 97, in load_all_weights
    saving.load_weights_from_hdf5_group(f['model_weights'], model.layers)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/keras/engine/saving.py", line 1199, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py", line 2727, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/ops/variables.py", line 2067, in assign
    self._variable, value, use_locking=use_locking, name=name)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/ops/state_ops.py", line 227, in assign
    validate_shape=validate_shape)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_state_ops.py", line 66, in assign
    use_locking=use_locking, name=name)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1770, in __init__
    control_input_ops)
  File "/home/matthew/Documents/wellcome/deep_reference_parser/build/virtualenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 0 in both shapes must be equal, but are 347 and 324. Shapes are [347,100] and [324,100]. for 'Assign' (op: 'Assign') with input shapes: [347,100], [324,100].
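The error pattern is characteristic of rebuilding a model whose embedding vocabulary is a different size from the one used at training time, and then loading the saved weights into it: a stored [324, 100] matrix cannot be assigned to a freshly built [347, 100] variable. A minimal sketch of that failure mode, using the sizes from the error message above (illustrative only, not this repo's code):

from keras.layers import Embedding, Input
from keras.models import Model

def build(vocab_size):
    # The embedding matrix has shape [vocab_size, 100], as in the error above.
    inp = Input(shape=(None,), dtype="int32")
    out = Embedding(input_dim=vocab_size, output_dim=100)(inp)
    return Model(inp, out)

trained = build(324)
trained.save_weights("demo_weights.h5")

# Rebuild with a larger index (347 entries) and load the old weights: on the
# Keras 2 / TF 1.x stack this raises the same Assign shape mismatch.
rebuilt = build(347)
rebuilt.load_weights("demo_weights.h5")  # ValueError: ... are 347 and 324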
@ivyleavedtoadflax ivyleavedtoadflax added the bug Something isn't working label Mar 31, 2020
@ivyleavedtoadflax ivyleavedtoadflax changed the title Mismatching dimensions when predicting multitask models Mismatching dimensions when predicting with multitask models Mar 31, 2020
@ivyleavedtoadflax (Contributor, Author)

Seems to be occurring in [embedded code permalink not captured], but I suspect that something has gone awry elsewhere.

@lizgzil (Contributor) commented Mar 31, 2020

Interesting: my error has slightly different numbers. ValueError: Dimension 0 in both shapes must be equal, but are 347 and 385. Shapes are [347,100] and [385,100]. for 'Assign' (op: 'Assign') with input shapes: [347,100], [385,100].

@lizgzil (Contributor) commented Apr 1, 2020

(This is more for my records than anything else.) I ran this for debugging so it didn't spend ages loading the model artefacts each time:

import os
from keras_contrib.utils import save_load_utils
from deep_reference_parser.common import MULTITASK_CFG
from deep_reference_parser.model_utils import get_config
from deep_reference_parser.reference_utils import break_into_chunks
from deep_reference_parser.deep_reference_parser import DeepReferenceParser
import en_core_web_sm

text = 'Upson MA (2019). This is a reference. In a journal. 16(1) 1-23'

config_file=MULTITASK_CFG
cfg = get_config(config_file)
MAX_WORDS = int(cfg["data"]["line_limit"]) 
OUTPUT = cfg["build"]["output"]
PRETRAINED_EMBEDDING = cfg["build"]["pretrained_embedding"]
DROPOUT = float(cfg["build"]["dropout"])
LSTM_HIDDEN = int(cfg["build"]["lstm_hidden"])
WORD_EMBEDDING_SIZE = int(cfg["build"]["word_embedding_size"])
CHAR_EMBEDDING_SIZE = int(cfg["build"]["char_embedding_size"])
WORD_EMBEDDINGS = cfg["build"]["word_embeddings"]
OUTPUT_PATH = cfg["build"]["output_path"]

nlp = en_core_web_sm.load()
doc = nlp(text)
chunks = break_into_chunks(doc, max_words=MAX_WORDS)
tokens = [[token.text for token in chunk] for chunk in chunks]

drp = DeepReferenceParser(output_path=OUTPUT_PATH)
drp.load_data(OUTPUT_PATH)

drp.build_model(
    output=OUTPUT,
    word_embeddings=WORD_EMBEDDINGS,
    pretrained_embedding=PRETRAINED_EMBEDDING,
    dropout=DROPOUT,
    lstm_hidden=LSTM_HIDDEN,
    word_embedding_size=WORD_EMBEDDING_SIZE,
    char_embedding_size=CHAR_EMBEDDING_SIZE,
)

drp.predict(tokens, load_weights=True)

# Fails here, but let's look inside predict:

weights_path = os.path.join(OUTPUT_PATH, "weights.h5")
save_load_utils.load_all_weights(
    drp.model, weights_path, include_optimizer=False)

# Same error here
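A hedged diagnostic sketch to go with the above: compare the sizes of the lookup tables loaded at prediction time against the shapes actually stored in weights.h5. Treating indices.pickle as a dict of index tables is an assumption here; adjust to whatever the pickle really contains.

import pickle
import h5py

MODEL_DIR = "models/multitask/2020.3.18_multitask"

# Print the size of each index table built at training time (assumed dict).
with open(MODEL_DIR + "/indices.pickle", "rb") as f:
    indices = pickle.load(f)
for name, index in indices.items():
    print(name, len(index))

# Walk every weight dataset in the HDF5 file and print its shape; one of
# these should be the [347, 100] (or [324, 100]) matrix from the error.
def print_shapes(name, obj):
    if hasattr(obj, "shape"):
        print(name, obj.shape)

with h5py.File(MODEL_DIR + "/weights.h5", "r") as f:
    f["model_weights"].visititems(print_shapes)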

@lizgzil (Contributor) commented Apr 1, 2020

Could it be to do with jgcbrouns's answer here: "Yea so this is a problem with the classes file, model file and/or anchor file not matching. Make sure that the same classes.txt file (the file where per new line your classes are defined) matches during training and during inference (test). In my case I used 2 different classes.txt file. One file had 4 categories and the other one had only 1 class."

@ivyleavedtoadflax (Contributor, Author)

Yes, I suppose it's possible, and I was having some issues with an empty class creeping in, if you recall? Not sure where it would have occurred in the current logic, though...

@ivyleavedtoadflax (Contributor, Author)

Note that this only occurs in the multitask scenario, so it must be something specific to it...

@lizgzil (Contributor) commented Apr 1, 2020

Is this what you expected? (i.e. that the last length is 886443)

>>> POLICY_TRAIN
'data/multitask/2020.3.18_multitask_train.tsv'
>>> train_data = load_tsv(POLICY_TRAIN)
>>> [len(l) for l in train_data[0]] # same for [len(l) for l in train_data[1]]

[150, 81, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 88, 150, 150, 121, 150, 150, 150, 150, 150, 150, 150, 58, 150, 150, 150, 150, 150, 108, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 5, 150, 150, 150, 150, 81, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 32, 150, 150, 150, 150, 150, 54, 150, 150, 150, 150, 150, 150, 150, ... 89, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 127, 150, 150, 110, 886443]

(The test and valid data don't have a final element > 150.)

@ivyleavedtoadflax (Contributor, Author)

No, that's not expected; these should all be 150 or less. The 886443 is the Rodrigues data. In the datalabs cleanup PR I made some changes in the 2020.3.19 recipe that should fix the Rodrigues data. The very short values are caused, I suspect, by prodigy_to_tsv respecting doc endings. If you remove the -d flag in the tsv_Makefile 2020.3.19 recipe and run the 2020.3.19 model again, these inputs should all be 150.
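For illustration, the chunking idea amounts to something like this (a minimal sketch, not the actual prodigy_to_tsv implementation):

def chunk_sequence(tokens, max_words=150):
    # Cut a token sequence into pieces of at most max_words, so that no
    # 886443-token outlier survives into the training data.
    return [tokens[i:i + max_words] for i in range(0, len(tokens), max_words)]

assert all(len(chunk) <= 150 for chunk in chunk_sequence(list(range(886443))))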

@ivyleavedtoadflax (Contributor, Author)

> [150, 81, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 88, 150, 150, 121, 150, 150, 150, 150, 150, 150, 150, 58, 150, 150, 150, 150, 150, 108, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 5, 150, 150, 150, 150, 81, 1, 2, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 32, 150, 150, 150, 150, 150, 54, 150, 150, 150, 150, 150, 150, 150, ... 89, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 150, 127, 150, 150, 110, 886443]

This is interesting, in fact, because these values are sequence lengths (i.e. 150 = 150 tokens). That means the final value got truncated to 150, because the 2020.3.18 model had line_length set to 150. Subsequent model runs (like 2020.3.19 and 2020.3.20) made better use of the Rodrigues data by ensuring it was cut into sequences of, say, 150, and those models performed less well. This suggests to me that the Rodrigues data is making the model worse, not better...

@ivyleavedtoadflax (Contributor, Author)

I'm going to have a play with #28 over the weekend. If it works out it may also fix this issue.

@ivyleavedtoadflax (Contributor, Author)

> I'm going to have a play with #28 over the weekend. If it works out it may also fix this issue.

So it's not going to fix anything anytime soon. But I hope that the CRF layer will be included in tensorflow addons soon, and then we will be able to update the model to use tf 2.0. In the meantime this problem persists.
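For reference, a hedged sketch of what the TF 2.x route might eventually look like using the CRF utilities in tensorflow-addons; the shapes and random inputs here are placeholders for illustration, not a working migration:

import tensorflow as tf
import tensorflow_addons as tfa

logits = tf.random.uniform((2, 150, 5))            # [batch, max_len, num_tags]
labels = tf.random.uniform((2, 150), maxval=5, dtype=tf.int32)
lens = tf.fill((2,), 150)                          # true sequence lengths

# CRF negative log-likelihood as the training loss...
log_likelihood, transition_params = tfa.text.crf_log_likelihood(
    logits, labels, lens)
loss = -tf.reduce_mean(log_likelihood)

# ...and Viterbi decoding at prediction time.
decoded_tags, _ = tfa.text.crf_decode(logits, transition_params, lens)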

@ivyleavedtoadflax ivyleavedtoadflax linked a pull request Apr 12, 2020 that will close this issue