Hi and thanks for all the great work!
I just wanted to point out a potential problem we encountered while training DGP. With our dataset, the first 50k iterations of "DGP on labeled frames only" ran successfully, but the subsequent "Running DGP" step failed with:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm. : Tensor had NaN values
[[{{node VerifyFinite/CheckNumerics}}]]
The error is raised at line 818 of fitdgp.py: [loss_eval, _] = sess.run([loss, train_op], feed_dict)
After some debugging, I traced the error to labeled frames whose labels had accidentally been placed far outside the normal range (DLC discards markers set exactly at x=0, y=0, but these were at x=1, y=4, whereas our labels normally have x, y > 200). After removing these improbable labels, training proceeded normally.
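In case it helps anyone hitting the same error, this is roughly how we scanned the labeled data for improbable coordinates. The file path, scorer name, and the 10-pixel threshold are assumptions for illustration; adjust them to your project layout and expected label range.

```python
import pandas as pd

THRESHOLD = 10  # assumed lower bound on plausible pixel coordinates for our data

# Hypothetical path; DLC stores labels per video in a CollectedData_<scorer>.h5 file
labels = pd.read_hdf("labeled-data/video1/CollectedData_scorer.h5")

# Columns form a MultiIndex (scorer, bodypart, x/y coord); flag any frame that has
# at least one coordinate below the threshold (NaN comparisons evaluate to False).
suspicious = labels[(labels < THRESHOLD).any(axis=1)]
print(suspicious)
```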
It's great that this gave us a chance to clean our training dataset, but it would be better if DGP could simply ignore such labels and emit a precise warning to alert the user; otherwise it's quite hard to figure out where the actual problem lies. Something along the lines of the sketch below.
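As a rough sketch of what I have in mind (the function and parameter names here are purely illustrative, not part of the existing DGP code):

```python
import warnings
import numpy as np

def sanitize_labels(coords, img_width, img_height, min_coord=2.0):
    """Drop labels outside a plausible range and warn about them.

    coords: float array of shape (n_frames, n_bodyparts, 2) holding (x, y).
    """
    coords = np.asarray(coords, dtype=float).copy()
    x, y = coords[..., 0], coords[..., 1]
    bad = (x < min_coord) | (y < min_coord) | (x >= img_width) | (y >= img_height)
    if bad.any():
        frames, parts = np.nonzero(bad)
        warnings.warn(
            f"Ignoring {int(bad.sum())} out-of-range labels at "
            f"(frame, bodypart) indices {list(zip(frames.tolist(), parts.tolist()))}"
        )
        # Mark the offending labels as missing; how DGP should treat them
        # downstream is of course up to the maintainers.
        coords[bad] = np.nan
    return coords
```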
Thanks,
Oliver