Did you see the '-inf' train and val losses in your run?
```
+-----------------------+----------------------------------------------------+
| Parameter             | Value                                              |
+-----------------------+----------------------------------------------------+
| train data pattern    | data/tinyshakespeare/tiny_shakespeare_train.bin    |
| val data pattern      | data/tinyshakespeare/tiny_shakespeare_val.bin      |
| output log file       | nullptr                                            |
| batch size B          | 4                                                  |
| sequence length T     | 512                                                |
| learning rate         | 0.000300                                           |
| val_loss_every        | 20                                                 |
| val_max_steps         | 20                                                 |
| sample_every          | 20                                                 |
| genT                  | 64                                                 |
+-----------------------+----------------------------------------------------+
| device                | Intel(R) Arc(TM) A770 Graphics                     |
+-----------------------+----------------------------------------------------+
| max_sequence_length T | 1024                                               |
| vocab_size V          | 50257                                              |
| padded_vocab_size Vp  | 50304                                              |
| num_layers L          | 12                                                 |
| num_heads NH          | 12                                                 |
| channels C            | 768                                                |
| num_parameters        | 124475904                                          |
+-----------------------+----------------------------------------------------+
| train_num_batches     | 149                                                |
| val_num_batches       | 16                                                 |
+-----------------------+----------------------------------------------------+
allocated 474 MiB for model parameters
allocated 2277 MiB for activations
val loss -inf
allocated 474 MiB for parameter gradients
allocated 78 MiB for activation gradients
allocated 474 MiB for AdamW optimizer state m
allocated 474 MiB for AdamW optimizer state v
step 1/149: train loss -inf (1373.163524 ms, 1491 tok/s)
step 2/149: train loss -inf (561.510626 ms, 3647 tok/s)
step 3/149: train loss -inf (562.402664 ms, 3641 tok/s)
step 4/149: train loss -inf (562.877144 ms, 3638 tok/s)
step 5/149: train loss 3.270137 (562.379342 ms, 3641 tok/s)
step 6/149: train loss -inf (562.222351 ms, 3642 tok/s)
step 7/149: train loss -inf (563.365131 ms, 3635 tok/s)
step 8/149: train loss -inf (562.343304 ms, 3641 tok/s)
step 9/149: train loss -inf (561.130251 ms, 3649 tok/s)
step 10/149: train loss 3.771136 (562.232482 ms, 3642 tok/s)
step 11/149: train loss 3.410619 (562.445602 ms, 3641 tok/s)
step 12/149: train loss -inf (562.126695 ms, 3643 tok/s)
step 13/149: train loss -inf (561.149267 ms, 3649 tok/s)
step 14/149: train loss -inf (562.618085 ms, 3640 tok/s)
step 15/149: train loss 3.552519 (562.104564 ms, 3643 tok/s)
step 16/149: train loss -inf (562.188434 ms, 3642 tok/s)
step 17/149: train loss 3.505062 (561.892586 ms, 3644 tok/s)
step 18/149: train loss 3.899063 (561.642640 ms, 3646 tok/s)
step 19/149: train loss 3.790717 (563.634025 ms, 3633 tok/s)
step 20/149: train loss 4.134653 (560.460468 ms, 3654 tok/s)
val loss -inf
generating:
---
O, disorporate, Bering to arm of Trussell, and for private use take me, since you are these children and not these. Unto the wise he, the fool I misjudged him, set me here Yea.
Letter from Faith A great prince I presume
---
step 21/149: train loss 3.076251 (574.284356 ms, 3566 tok/s)
step 22/149: train loss 4.044003 (560.395851 ms, 3654 tok/s)
step 23/149: train loss 3.664719 (561.161018 ms, 3649 tok/s)
step 24/149: train loss 3.619468 (560.915251 ms, 3651 tok/s)
step 25/149: train loss 3.448017 (560.725216 ms, 3652 tok/s)
step 26/149: train loss 3.467965 (562.267702 ms, 3642 tok/s)
step 27/149: train loss -inf (560.669723 ms, 3652 tok/s)
step 28/149: train loss 3.983095 (561.373767 ms, 3648 tok/s)
step 29/149: train loss 3.626441 (561.407721 ms, 3647 tok/s)
step 30/149: train loss 3.650180 (561.128670 ms, 3649 tok/s)
step 31/149: train loss 4.230763 (561.467428 ms, 3647 tok/s)
step 32/149: train loss 3.920545 (561.079357 ms, 3650 tok/s)
step 33/149: train loss 3.523292 (561.587132 ms, 3646 tok/s)
step 34/149: train loss 3.645729 (562.089952 ms, 3643 tok/s)
step 35/149: train loss -inf (561.293155 ms, 3648 tok/s)
step 36/149: train loss 3.296374 (562.754651 ms, 3639 tok/s)
step 37/149: train loss 3.665959 (561.024074 ms, 3650 tok/s)
step 38/149: train loss 3.581248 (561.042401 ms, 3650 tok/s)
step 39/149: train loss -inf (561.285337 ms, 3648 tok/s)
step 40/149: train loss 3.861797 (560.898981 ms, 3651 tok/s)
val loss -inf
```
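For reference, an infinite cross-entropy value almost always means the exp/log path degenerated somewhere: `expf()` of a large logit overflowing the fp32 normalizer, or the target token's softmax probability underflowing to exactly zero so the log term becomes infinite. Whether it then surfaces as `-inf` or `+inf` depends on where the sign flip happens in the reduction. Below is a minimal host-side sketch of that failure mode and the usual max-subtraction guard; it assumes nothing about this repo's actual SYCL kernels, and the function names are hypothetical:

```c
#include <math.h>
#include <stdio.h>

// Hypothetical illustration (not the repo's kernel): naive softmax +
// cross-entropy, where expf() can overflow to inf and the target token's
// probability can underflow to 0, making the log term infinite.
float cross_entropy_naive(const float* logits, int V, int target) {
    float sum = 0.0f;
    for (int i = 0; i < V; i++) sum += expf(logits[i]);  // expf(~89+) -> inf in fp32
    float p = expf(logits[target]) / sum;                // finite / inf -> 0
    return -logf(p);                                     // logf(0) is -inf
}

// Standard guard: subtract the row max so every exponent is <= 0, and use
// the log-sum-exp identity  -log softmax(t) = log(sum_i e^(x_i - m)) + m - x_t.
float cross_entropy_stable(const float* logits, int V, int target) {
    float m = logits[0];
    for (int i = 1; i < V; i++) if (logits[i] > m) m = logits[i];
    float sum = 0.0f;
    for (int i = 0; i < V; i++) sum += expf(logits[i] - m);
    return logf(sum) + m - logits[target];
}

int main(void) {
    // one logit large enough that expf() overflows fp32
    // (FLT_MAX ~ 3.4e38, expf(90.0f) ~ 1.2e39)
    float logits[4] = {90.0f, 1.0f, 2.0f, 3.0f};
    printf("naive : %f\n", cross_entropy_naive(logits, 4, 1));   // inf
    printf("stable: %f\n", cross_entropy_stable(logits, 4, 1));  // ~89.0
    return 0;
}
```

Since the same binary produces finite losses on most steps, it might also be worth dumping the per-token losses for one affected batch: if only a handful of positions are infinite, that points at the classifier-kernel numerics on this device rather than at the data loader.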