python.log
nohup: ignoring input
Using backend: pytorch
[2021-03-22 13:55:00] INFO >> Load arguments in /home/wanyao/yang/naturalcc-dev/run/summarization/transformer/config/python_wan/python.yml (train.py:313, cli_main())
[2021-03-22 13:55:00] INFO >> {'criterion': 'be_cross_entropy', 'optimizer': 'torch_adam', 'lr_scheduler': 'fixed', 'tokenizer': None, 'bpe': None, 'common': {'no_progress_bar': 0, 'log_interval': 400, 'log_format': 'simple', 'tensorboard_logdir': '', 'memory_efficient_fp16': 1, 'fp16_no_flatten_grads': 1, 'fp16_init_scale': 128, 'fp16_scale_window': None, 'fp16_scale_tolerance': 0.0, 'min_loss_scale': 0.0001, 'threshold_loss_scale': None, 'empty_cache_freq': 0, 'task': 'be_summarization', 'seed': 1, 'cpu': 0, 'fp16': 0, 'fp16_opt_level': '01', 'bf16': 0, 'memory_efficient_bf16': 0, 'server_ip': '', 'server_port': ''}, 'dataset': {'num_workers': 3, 'skip_invalid_size_inputs_valid_test': 1, 'max_tokens': None, 'max_sentences': 64, 'required_batch_size_multiple': 1, 'dataset_impl': 'mmap', 'train_subset': 'train', 'valid_subset': 'valid', 'validate_interval': 1, 'fixed_validation_seed': None, 'disable_validation': 0, 'max_tokens_valid': None, 'max_sentences_valid': 1024, 'curriculum': 0, 'gen_subset': 'test', 'num_shards': 1, 'shard_id': 0}, 'distributed_training': {'distributed_world_size': 1, 'distributed_rank': 0, 'distributed_backend': 'nccl', 'distributed_init_method': None, 'distributed_port': -1, 'device_id': 0, 'pipeline_model_parallel': 0, 'distributed_no_spawn': 0, 'ddp_backend': 'c10d', 'bucket_cap_mb': 25, 'fix_batches_to_gpus': None, 'find_unused_parameters': 0, 'fast_stat_sync': 0, 'broadcast_buffers': 0, 'global_sync_iter': 50, 'warmup_iterations': 500, 'local_rank': -1, 'block_momentum': 0.875, 'block_lr': 1, 'use_nbm': 0, 'average_sync': 0}, 'task': {'data': '/mnt/wanyao/.ncc/python_wan/summarization/data-mmap', 'source_lang': 'code_tokens', 'target_lang': 'docstring_tokens', 'load_alignments': 0, 'left_pad_source': 0, 'left_pad_target': 0, 'max_source_positions': 400, 'max_target_positions': 32, 'upsample_primary': 1, 'truncate_source': 1, 'truncate_target': 1, 'append_eos_to_target': 1, 'eval_bleu': 1, 'eval_bleu_detok': 'space', 'eval_bleu_detok_args': None, 'eval_tokenized_bleu': 0, 'eval_bleu_remove_bpe': None, 'eval_bleu_args': None, 'eval_bleu_print_samples': 0}, 'model': {'arch': 'transformer_summarization', 'offset_positions_by_padding': 1, 'pooler_dropout': 0.2, 'activation_fn': 'relu', 'dropout': 0.2, 'attention_dropout': 0.2, 'activation_dropout': 0.2, 'relu_dropout': 0.2, 'encoder_positional_embeddings': 0, 'encoder_learned_pos': 1, 'encoder_max_relative_len': 0, 'encoder_embed_path': 0, 'encoder_embed_dim': 512, 'encoder_ffn_embed_dim': 2048, 'encoder_layers': 6, 'encoder_attention_heads': 8, 'encoder_normalize_before': 1, 'decoder_embed_path': '', 'decoder_positional_embeddings': 0, 'decoder_learned_pos': 1, 'decoder_max_relative_len': 0, 'decoder_embed_dim': 512, 'decoder_output_dim': 512, 'decoder_input_dim': 512, 'decoder_ffn_embed_dim': 2048, 'decoder_layers': 6, 'decoder_attention_heads': 8, 'decoder_normalize_before': 0, 'no_decoder_final_norm': 0, 'adaptive_softmax_cutoff': None, 'adaptive_softmax_dropout': 0.2, 'adaptive_softmax_factor': 0.0, 'share_decoder_input_output_embed': 1, 'decoder_out_embed_bias': 1, 'share_all_embeddings': 0, 'adaptive_input': 0, 'adaptive_input_factor': 0.0, 'adaptive_input_cutoff': None, 'tie_adaptive_weights': 0, 'tie_adaptive_proj': 0, 'no_cross_attention': 0, 'cross_self_attention': 0, 'layer_wise_attention': 0, 'encoder_layerdrop': 0.0, 'decoder_layerdrop': 0.0, 'encoder_layers_to_keep': None, 'decoder_layers_to_keep': None, 'layernorm_embedding': 0, 'no_scale_embedding': 1, 'encoder_dropout_in': 0.2, 
'encoder_dropout_out': 0.2, 'decoder_dropout_in': 0.2, 'decoder_dropout_out': 0.2, 'max_source_positions': 400, 'max_target_positions': 32, 'multihead_attention_version': 'ncc', 'encoder_position_encoding_version': 'ncc_sinusoidal', 'decoder_position_encoding_version': 'ncc_sinusoidal'}, 'optimization': {'max_epoch': 200, 'max_update': 0, 'clip_norm': 5, 'update_freq': [1], 'lrs': [0.0001], 'min_lr': -1, 'use_bmuf': 1, 'force_anneal': 0, 'warmup_updates': 0, 'end_learning_rate': 0.0, 'power': 1.0, 'total_num_update': 1000000, 'sentence_avg': 0, 'adam': {'adam_betas': '(0.9, 0.999)', 'adam_eps': 1e-08, 'weight_decay': 0.0, 'use_old_adam': 0}, 'lr_shrink': 0.99}, 'checkpoint': {'restore_file': 'checkpoint_last.pt', 'reset_dataloader': None, 'reset_lr_scheduler': None, 'reset_meters': None, 'reset_optimizer': None, 'optimizer_overrides': '{}', 'save_interval': 1, 'save_interval_updates': 0, 'keep_interval_updates': 0, 'keep_last_epochs': -1, 'keep_best_checkpoints': -1, 'no_save': 0, 'no_epoch_checkpoints': 1, 'no_last_checkpoints': 0, 'no_save_optimizer_state': None, 'best_checkpoint_metric': 'bleu', 'maximize_best_checkpoint_metric': 1, 'patience': -1, 'save_dir': '/mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints', 'should_continue': 0, 'model_name_or_path': None, 'cache_dir': None, 'logging_steps': 500, 'save_steps': 2000, 'save_total_limit': 2, 'overwrite_output_dir': 0, 'overwrite_cache': 0}, 'eval': {'path': '/mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt', 'remove_bpe': None, 'quiet': 1, 'results_path': None, 'model_overrides': '{}', 'max_sentences': 512, 'beam': 1, 'nbest': 1, 'max_len_a': 0, 'max_len_b': 30, 'min_len': 1, 'match_source_len': 0, 'no_early_stop': 1, 'unnormalized': 0, 'no_beamable_mm': 0, 'lenpen': 1, 'unkpen': 0, 'replace_unk': None, 'sacrebleu': 0, 'score_reference': 0, 'prefix_size': 0, 'no_repeat_ngram_size': 0, 'sampling': 0, 'sampling_topk': -1, 'sampling_topp': -1, 'temperature': 1.0, 'diverse_beam_groups': -1, 'diverse_beam_strength': 0.5, 'diversity_rate': -1.0, 'print_alignment': 0, 'print_step': 0, 'iter_decode_eos_penalty': 0.0, 'iter_decode_max_iter': 10, 'iter_decode_force_max_iter': 0, 'iter_decode_with_beam': 1, 'iter_decode_with_external_reranker': 0, 'retain_iter_history': 0, 'decoding_format': None, 'nltk_bleu': 1, 'rouge': 1}} (train.py:315, cli_main())
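The run is driven entirely by the YAML file named in the first log line. A minimal sketch of reading such a config (assuming the YAML mirrors the nested dict dumped above; this is not naturalcc's own argument loader):

# Sketch only: read the run config with PyYAML and pull out a few fields that matter
# for this log. Path is the one reported above; the nesting is assumed to match the dump.
import yaml

CONFIG = "/home/wanyao/yang/naturalcc-dev/run/summarization/transformer/config/python_wan/python.yml"
with open(CONFIG) as f:
    args = yaml.safe_load(f)

print(args['common']['task'])                        # 'be_summarization'
print(args['task']['max_source_positions'])          # 400
print(args['task']['max_target_positions'])          # 32
print(args['optimization']['max_epoch'])             # 200
print(args['checkpoint']['best_checkpoint_metric'])  # 'bleu'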
[2021-03-22 13:55:00] INFO >> single GPU training... (train.py:344, cli_main())
[2021-03-22 13:55:01] INFO >> [code_tokens] dictionary: 50000 types (be_summarization.py:137, setup_task())
[2021-03-22 13:55:01] INFO >> [docstring_tokens] dictionary: 30000 types (be_summarization.py:138, setup_task())
[2021-03-22 13:55:01] INFO >> truncate valid.code_tokens to 400 (be_summarization.py:72, load_langpair_dataset())
[2021-03-22 13:55:01] INFO >> truncate valid.docstring_tokens to 30 (be_summarization.py:80, load_langpair_dataset())
[2021-03-22 13:55:01] INFO >> loaded 18505 examples from: /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/valid.code_tokens (be_summarization.py:89, load_langpair_dataset())
[2021-03-22 13:55:01] INFO >> loaded 18505 examples from: /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/valid.docstring_tokens (be_summarization.py:90, load_langpair_dataset())
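The truncation lengths above follow the config: sources are capped at max_source_positions=400, and targets are cut to 30 (presumably max_target_positions=32 minus room for special tokens; that reading is an assumption). A toy sketch of the reported behaviour, not naturalcc's load_langpair_dataset:

# Toy sketch of the truncation reported above (assumed behaviour, not naturalcc's code).
def truncate(tokens, max_len):
    return tokens[:max_len]

src = ["def", "foo", "(", ")", ":"] * 200    # 1000 code tokens
tgt = ["return", "the", "answer"] * 20       # 60 docstring tokens
assert len(truncate(src, 400)) == 400
assert len(truncate(tgt, 30)) == 30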
[2021-03-22 13:55:02] INFO >> TransformerModel(
(encoder): TransformerEncoder(
(embed_tokens): Embedding(50000, 512, padding_idx=0)
(embed_positions): SinusoidalPositionalEmbedding(401, 512, padding_idx=0)
(layers): ModuleList(
(0): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(1): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(2): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(3): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(4): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(5): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
)
(layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(decoder): TransformerDecoder(
(embed_tokens): Embedding(30000, 512, padding_idx=0)
(embed_positions): SinusoidalPositionalEmbedding(33, 512, padding_idx=0)
(layers): ModuleList(
(0): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(1): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(2): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(3): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(4): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
(5): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(k_proj): Linear(in_features=512, out_features=512, bias=True)
(v_proj): Linear(in_features=512, out_features=512, bias=True)
(q_proj): Linear(in_features=512, out_features=512, bias=True)
(out_proj): Linear(in_features=512, out_features=512, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(final_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
)
(out_generator): Linear(in_features=512, out_features=30000, bias=True)
)
) (train.py:234, single_main())
[2021-03-22 13:55:02] INFO >> model transformer_summarization, criterion BECrossEntropyCriterion (train.py:235, single_main())
[2021-03-22 13:55:02] INFO >> num. model params: 85129520 (num. trained: 85129520) (train.py:236, single_main())
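The 85,129,520 figure is consistent with the module dump above, assuming the sinusoidal position encodings are fixed (unlearned) buffers and, per share_decoder_input_output_embed=1, the out_generator weight is tied to the decoder embedding so only its bias adds parameters. A back-of-the-envelope check:

# Rough parameter count for the printed architecture (a consistency check, not naturalcc code).
d, ffn, V_src, V_tgt = 512, 2048, 50000, 30000
attn = 4 * (d * d + d)                        # q/k/v/out projections with bias
ln = 2 * d                                    # LayerNorm weight + bias
enc_layer = attn + (d * ffn + ffn) + (ffn * d + d) + 2 * ln
dec_layer = 2 * attn + (d * ffn + ffn) + (ffn * d + d) + 3 * ln
total = (V_src * d                            # encoder embedding
         + 6 * enc_layer + ln                 # encoder layers + final layer norm
         + V_tgt * d                          # decoder embedding (weight shared with out_generator)
         + 6 * dec_layer
         + V_tgt)                             # out_generator bias
print(total)                                  # 85129520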
[2021-03-22 13:55:07] INFO >> training on 1 GPUs (train.py:243, single_main())
[2021-03-22 13:55:07] INFO >> max tokens per GPU = None and max sentences per GPU = 64 (train.py:244, single_main())
[2021-03-22 13:55:07] INFO >> no existing checkpoint found /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (ncc_trainer.py:269, load_checkpoint())
[2021-03-22 13:55:07] INFO >> loading train data for epoch 1 (ncc_trainer.py:283, get_train_iterator())
[2021-03-22 13:55:07] INFO >> truncate train.code_tokens to 400 (be_summarization.py:72, load_langpair_dataset())
[2021-03-22 13:55:07] INFO >> truncate train.docstring_tokens to 30 (be_summarization.py:80, load_langpair_dataset())
[2021-03-22 13:55:07] INFO >> loaded 55538 examples from: /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/train.code_tokens (be_summarization.py:89, load_langpair_dataset())
[2021-03-22 13:55:07] INFO >> loaded 55538 examples from: /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/train.docstring_tokens (be_summarization.py:90, load_langpair_dataset())
[2021-03-22 13:55:08] INFO >> NOTE: your device may support faster training with fp16 (ncc_trainer.py:154, _setup_optimizer())
/home/wanyao/yang/naturalcc-dev/ncc/utils/utils.py:574: UserWarning: amp_C fused kernels unavailable, disabling multi_tensor_l2norm; you may get better performance by installing NVIDIA's apex library
warnings.warn(
[2021-03-22 13:56:51] INFO >> epoch 001: 400 / 868 loss=46.098, bleu=0, ppl=7.53057e+13, wps=2782.8, ups=4.17, wpb=667.3, bsz=64, num_updates=400, lr=0.0001, gnorm=50.062, clip=100, train_wall=95, wall=103 (progress_bar.py:260, log())
[2021-03-22 13:58:28] INFO >> epoch 001: 800 / 868 loss=28.665, bleu=0, ppl=4.2555e+08, wps=2739.8, ups=4.11, wpb=666.4, bsz=64, num_updates=800, lr=0.0001, gnorm=42.08, clip=100, train_wall=96, wall=200 (progress_bar.py:260, log())
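Reading the per-update metrics: ppl appears to be 2**loss (base-2 cross-entropy) and wps is roughly ups * wpb; the two entries above check out within rounding. This interpretation is inferred from the numbers themselves, not from naturalcc documentation:

# Quick sanity check on the metric relationships seen above.
print(2 ** 46.098)     # ~7.53e+13  -> ppl at update 400
print(2 ** 28.665)     # ~4.26e+08  -> ppl at update 800
print(4.17 * 667.3)    # ~2782.6    -> wps at update 400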
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 13:58:45] INFO >> epoch 001 | loss 36.362 | bleu 0 | ppl 8.83131e+10 | wps 2748.6 | ups 4.12 | wpb 666.6 | bsz 64 | num_updates 868 | lr 0.0001 | gnorm 45.917 | clip 100 | train_wall 207 | wall 218 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:00:17] INFO >> epoch 001 | valid on 'valid' subset | loss 20.097 | bleu 11.3704 | ppl 1.12142e+06 | wps 2261.4 | wpb 10165.6 | bsz 973.9 | num_updates 868 (progress_bar.py:269, print())
[2021-03-22 14:00:21] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 1 @ 868 updates, score 11.370375) (writing took 3.393730 seconds) (checkpoint_utils.py:79, save_checkpoint())
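From here on the log alternates a training epoch, a validation pass, and a checkpoint write. Per the config (best_checkpoint_metric='bleu', maximize_best_checkpoint_metric=1, no_epoch_checkpoints=1), checkpoint_best.pt is refreshed only when validation BLEU improves; otherwise only checkpoint_last.pt is rewritten, as at epoch 3 below. A small sketch of that policy (not the actual checkpoint_utils.save_checkpoint):

# Sketch of the best/last checkpoint policy visible in this log; assumed from the config,
# not taken from naturalcc's checkpoint_utils.
best_bleu = None

def save_checkpoint(valid_bleu, write):
    global best_bleu
    write("checkpoint_last.pt")                      # always refresh the last checkpoint
    if best_bleu is None or valid_bleu > best_bleu:  # higher BLEU is better
        best_bleu = valid_bleu
        write("checkpoint_best.pt")                  # refresh the best checkpoint

write = lambda path: print("writing", path)
save_checkpoint(11.3704, write)   # epoch 1: first score, best and last written
save_checkpoint(15.5304, write)   # epoch 2: BLEU improved, best refreshed
save_checkpoint(15.4659, write)   # epoch 3: BLEU dropped, only checkpoint_last.pt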
[2021-03-22 14:01:48] INFO >> epoch 002: 332 / 868 loss=22.173, bleu=0, ppl=4.72967e+06, wps=1334.8, ups=2, wpb=667.1, bsz=64, num_updates=1200, lr=9.9e-05, gnorm=42.822, clip=100, train_wall=96, wall=400 (progress_bar.py:260, log())
[2021-03-22 14:03:25] INFO >> epoch 002: 732 / 868 loss=18.443, bleu=0, ppl=356299, wps=2732.9, ups=4.11, wpb=665.3, bsz=64, num_updates=1600, lr=9.9e-05, gnorm=40.038, clip=100, train_wall=96, wall=498 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:03:59] INFO >> epoch 002 | loss 19.41 | bleu 0 | ppl 696564 | wps 1841.3 | ups 2.76 | wpb 666.6 | bsz 64 | num_updates 1736 | lr 9.9e-05 | gnorm 40.862 | clip 100 | train_wall 209 | wall 532 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:05:25] INFO >> epoch 002 | valid on 'valid' subset | loss 14.306 | bleu 15.5304 | ppl 20260.5 | wps 2448.1 | wpb 10165.6 | bsz 973.9 | num_updates 1736 | best_bleu 15.5304 (progress_bar.py:269, print())
[2021-03-22 14:06:18] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 2 @ 1736 updates, score 15.530422) (writing took 52.652943 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:07:28] INFO >> epoch 003: 264 / 868 loss=15.833, bleu=0, ppl=58377.2, wps=1101.6, ups=1.65, wpb=667.6, bsz=64, num_updates=2000, lr=9.8e-05, gnorm=38.722, clip=100, train_wall=96, wall=740 (progress_bar.py:260, log())
[2021-03-22 14:09:03] INFO >> epoch 003: 664 / 868 loss=14.095, bleu=0, ppl=17498.6, wps=2778.8, ups=4.17, wpb=665.7, bsz=64, num_updates=2400, lr=9.8e-05, gnorm=36.541, clip=100, train_wall=94, wall=836 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:09:54] INFO >> epoch 003 | loss 14.266 | bleu 0 | ppl 19705.1 | wps 1631.9 | ups 2.45 | wpb 666.6 | bsz 64 | num_updates 2604 | lr 9.8e-05 | gnorm 36.788 | clip 100 | train_wall 206 | wall 887 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:11:20] INFO >> epoch 003 | valid on 'valid' subset | loss 11.299 | bleu 15.4659 | ppl 2519.19 | wps 2422.6 | wpb 10165.6 | bsz 973.9 | num_updates 2604 | best_bleu 15.5304 (progress_bar.py:269, print())
[2021-03-22 14:11:48] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 3 @ 2604 updates, score 15.465893) (writing took 27.859430 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:12:43] INFO >> epoch 004: 196 / 868 loss=12.759, bleu=0, ppl=6932.8, wps=1217.3, ups=1.82, wpb=667.3, bsz=64, num_updates=2800, lr=9.7e-05, gnorm=34.753, clip=100, train_wall=97, wall=1055 (progress_bar.py:260, log())
[2021-03-22 14:14:19] INFO >> epoch 004: 596 / 868 loss=11.686, bleu=0, ppl=3295.44, wps=2775.4, ups=4.17, wpb=665.5, bsz=64, num_updates=3200, lr=9.7e-05, gnorm=33.571, clip=100, train_wall=94, wall=1151 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:15:26] INFO >> epoch 004 | loss 11.657 | bleu 0 | ppl 3228.17 | wps 1743.5 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 3472 | lr 9.7e-05 | gnorm 33.111 | clip 100 | train_wall 207 | wall 1218 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:16:52] INFO >> epoch 004 | valid on 'valid' subset | loss 9.647 | bleu 15.0834 | ppl 801.55 | wps 2429.3 | wpb 10165.6 | bsz 973.9 | num_updates 3472 | best_bleu 15.5304 (progress_bar.py:269, print())
[2021-03-22 14:17:20] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 4 @ 3472 updates, score 15.08336) (writing took 27.504957 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:17:58] INFO >> epoch 005: 128 / 868 loss=10.935, bleu=0, ppl=1958.14, wps=1221.2, ups=1.83, wpb=668.4, bsz=64, num_updates=3600, lr=9.6e-05, gnorm=31.234, clip=100, train_wall=96, wall=1370 (progress_bar.py:260, log())
[2021-03-22 14:19:34] INFO >> epoch 005: 528 / 868 loss=10.21, bleu=0, ppl=1184.64, wps=2763.8, ups=4.15, wpb=665.3, bsz=64, num_updates=4000, lr=9.6e-05, gnorm=30.465, clip=100, train_wall=95, wall=1466 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:20:58] INFO >> epoch 005 | loss 10.119 | bleu 0 | ppl 1111.79 | wps 1743.6 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 4340 | lr 9.6e-05 | gnorm 29.835 | clip 100 | train_wall 208 | wall 1550 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:22:24] INFO >> epoch 005 | valid on 'valid' subset | loss 8.883 | bleu 15.2031 | ppl 472.16 | wps 2437.7 | wpb 10165.6 | bsz 973.9 | num_updates 4340 | best_bleu 15.5304 (progress_bar.py:269, print())
[2021-03-22 14:22:50] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 5 @ 4340 updates, score 15.203093) (writing took 25.984781 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:23:11] INFO >> epoch 006: 60 / 868 loss=9.732, bleu=0, ppl=850.67, wps=1230.3, ups=1.85, wpb=666.6, bsz=64, num_updates=4400, lr=9.5e-05, gnorm=28.679, clip=100, train_wall=96, wall=1683 (progress_bar.py:260, log())
[2021-03-22 14:24:46] INFO >> epoch 006: 460 / 868 loss=9.281, bleu=0, ppl=621.94, wps=2799.4, ups=4.19, wpb=667.8, bsz=64, num_updates=4800, lr=9.5e-05, gnorm=27.539, clip=100, train_wall=94, wall=1779 (progress_bar.py:260, log())
[2021-03-22 14:26:24] INFO >> epoch 006: 860 / 868 loss=8.927, bleu=0, ppl=486.65, wps=2704.8, ups=4.06, wpb=665.7, bsz=64, num_updates=5200, lr=9.5e-05, gnorm=26.631, clip=100, train_wall=97, wall=1877 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:26:27] INFO >> epoch 006 | loss 9.129 | bleu 0 | ppl 559.72 | wps 1757.2 | ups 2.64 | wpb 666.6 | bsz 64 | num_updates 5208 | lr 9.5e-05 | gnorm 27.151 | clip 100 | train_wall 207 | wall 1880 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:27:54] INFO >> epoch 006 | valid on 'valid' subset | loss 8.301 | bleu 15.3611 | ppl 315.32 | wps 2417.2 | wpb 10165.6 | bsz 973.9 | num_updates 5208 | best_bleu 15.5304 (progress_bar.py:269, print())
[2021-03-22 14:28:22] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 6 @ 5208 updates, score 15.361138) (writing took 27.752488 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:30:03] INFO >> epoch 007: 392 / 868 loss=8.594, bleu=0, ppl=386.5, wps=1218.2, ups=1.83, wpb=667.1, bsz=64, num_updates=5600, lr=9.4e-05, gnorm=25.197, clip=100, train_wall=95, wall=2096 (progress_bar.py:260, log())
[2021-03-22 14:31:40] INFO >> epoch 007: 792 / 868 loss=8.345, bleu=0, ppl=325.13, wps=2770.6, ups=4.16, wpb=666, bsz=64, num_updates=6000, lr=9.4e-05, gnorm=24.404, clip=100, train_wall=95, wall=2192 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:31:59] INFO >> epoch 007 | loss 8.442 | bleu 0 | ppl 347.67 | wps 1742.2 | ups 2.61 | wpb 666.6 | bsz 64 | num_updates 6076 | lr 9.4e-05 | gnorm 24.726 | clip 100 | train_wall 206 | wall 2212 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:33:23] INFO >> epoch 007 | valid on 'valid' subset | loss 7.996 | bleu 16.6458 | ppl 255.25 | wps 2506.4 | wpb 10165.6 | bsz 973.9 | num_updates 6076 | best_bleu 16.6458 (progress_bar.py:269, print())
[2021-03-22 14:34:16] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 7 @ 6076 updates, score 16.645835) (writing took 52.173543 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:35:41] INFO >> epoch 008: 324 / 868 loss=8.067, bleu=0, ppl=268.2, wps=1108.2, ups=1.66, wpb=667.5, bsz=64, num_updates=6400, lr=9.3e-05, gnorm=23.142, clip=100, train_wall=95, wall=2433 (progress_bar.py:260, log())
[2021-03-22 14:37:17] INFO >> epoch 008: 724 / 868 loss=7.893, bleu=0, ppl=237.77, wps=2755.3, ups=4.14, wpb=665.2, bsz=64, num_updates=6800, lr=9.3e-05, gnorm=22.288, clip=100, train_wall=95, wall=2530 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:37:53] INFO >> epoch 008 | loss 7.922 | bleu 0 | ppl 242.6 | wps 1635.2 | ups 2.45 | wpb 666.6 | bsz 64 | num_updates 6944 | lr 9.3e-05 | gnorm 22.414 | clip 100 | train_wall 207 | wall 2566 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:39:20] INFO >> epoch 008 | valid on 'valid' subset | loss 7.69 | bleu 15.7951 | ppl 206.55 | wps 2411.3 | wpb 10165.6 | bsz 973.9 | num_updates 6944 | best_bleu 16.6458 (progress_bar.py:269, print())
[2021-03-22 14:39:47] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 8 @ 6944 updates, score 15.79508) (writing took 26.920045 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:40:56] INFO >> epoch 009: 256 / 868 loss=7.662, bleu=0, ppl=202.52, wps=1220.6, ups=1.83, wpb=668, bsz=64, num_updates=7200, lr=9.2e-05, gnorm=21.454, clip=100, train_wall=96, wall=2749 (progress_bar.py:260, log())
[2021-03-22 14:42:33] INFO >> epoch 009: 656 / 868 loss=7.503, bleu=0, ppl=181.38, wps=2751.7, ups=4.13, wpb=665.6, bsz=64, num_updates=7600, lr=9.2e-05, gnorm=20.595, clip=100, train_wall=95, wall=2845 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:43:26] INFO >> epoch 009 | loss 7.51 | bleu 0 | ppl 182.34 | wps 1739.1 | ups 2.61 | wpb 666.6 | bsz 64 | num_updates 7812 | lr 9.2e-05 | gnorm 20.683 | clip 100 | train_wall 208 | wall 2898 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:44:54] INFO >> epoch 009 | valid on 'valid' subset | loss 7.464 | bleu 15.9752 | ppl 176.5 | wps 2381.5 | wpb 10165.6 | bsz 973.9 | num_updates 7812 | best_bleu 16.6458 (progress_bar.py:269, print())
[2021-03-22 14:45:27] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 9 @ 7812 updates, score 15.975205) (writing took 33.262008 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:46:19] INFO >> epoch 010: 188 / 868 loss=7.36, bleu=0, ppl=164.32, wps=1177.7, ups=1.77, wpb=667.2, bsz=64, num_updates=8000, lr=9.1e-05, gnorm=19.852, clip=100, train_wall=96, wall=3072 (progress_bar.py:260, log())
[2021-03-22 14:47:55] INFO >> epoch 010: 588 / 868 loss=7.154, bleu=0, ppl=142.44, wps=2771, ups=4.16, wpb=665.5, bsz=64, num_updates=8400, lr=9.1e-05, gnorm=19.105, clip=100, train_wall=95, wall=3168 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:49:05] INFO >> epoch 010 | loss 7.168 | bleu 0 | ppl 143.83 | wps 1706.1 | ups 2.56 | wpb 666.6 | bsz 64 | num_updates 8680 | lr 9.1e-05 | gnorm 19.02 | clip 100 | train_wall 207 | wall 3237 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:50:34] INFO >> epoch 010 | valid on 'valid' subset | loss 7.281 | bleu 16.3654 | ppl 155.5 | wps 2343.3 | wpb 10165.6 | bsz 973.9 | num_updates 8680 | best_bleu 16.6458 (progress_bar.py:269, print())
[2021-03-22 14:51:15] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 10 @ 8680 updates, score 16.365427) (writing took 41.173797 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:51:50] INFO >> epoch 011: 120 / 868 loss=7.073, bleu=0, ppl=134.64, wps=1137.6, ups=1.7, wpb=667.7, bsz=64, num_updates=8800, lr=9e-05, gnorm=18.337, clip=100, train_wall=96, wall=3403 (progress_bar.py:260, log())
[2021-03-22 14:53:26] INFO >> epoch 011: 520 / 868 loss=6.886, bleu=0, ppl=118.26, wps=2767.8, ups=4.16, wpb=665.9, bsz=64, num_updates=9200, lr=9e-05, gnorm=17.702, clip=100, train_wall=95, wall=3499 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:54:53] INFO >> epoch 011 | loss 6.88 | bleu 0 | ppl 117.76 | wps 1663.2 | ups 2.5 | wpb 666.6 | bsz 64 | num_updates 9548 | lr 9e-05 | gnorm 17.472 | clip 100 | train_wall 208 | wall 3585 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 14:56:20] INFO >> epoch 011 | valid on 'valid' subset | loss 7.155 | bleu 16.75 | ppl 142.51 | wps 2399.4 | wpb 10165.6 | bsz 973.9 | num_updates 9548 | best_bleu 16.75 (progress_bar.py:269, print())
[2021-03-22 14:57:12] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 11 @ 9548 updates, score 16.750014) (writing took 52.322836 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 14:57:32] INFO >> epoch 012: 52 / 868 loss=6.806, bleu=0, ppl=111.91, wps=1087.1, ups=1.63, wpb=666.8, bsz=64, num_updates=9600, lr=9e-05, gnorm=16.979, clip=100, train_wall=97, wall=3744 (progress_bar.py:260, log())
[2021-03-22 14:59:08] INFO >> epoch 012: 452 / 868 loss=6.669, bleu=0, ppl=101.75, wps=2781.1, ups=4.16, wpb=668.1, bsz=64, num_updates=10000, lr=9e-05, gnorm=16.608, clip=100, train_wall=95, wall=3840 (progress_bar.py:260, log())
[2021-03-22 15:00:46] INFO >> epoch 012: 852 / 868 loss=6.568, bleu=0, ppl=94.88, wps=2718.9, ups=4.09, wpb=665.3, bsz=64, num_updates=10400, lr=9e-05, gnorm=15.894, clip=100, train_wall=96, wall=3938 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:00:50] INFO >> epoch 012 | loss 6.627 | bleu 0 | ppl 98.82 | wps 1618 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 10416 | lr 9e-05 | gnorm 16.261 | clip 100 | train_wall 207 | wall 3943 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:02:17] INFO >> epoch 012 | valid on 'valid' subset | loss 7.021 | bleu 17.397 | ppl 129.85 | wps 2418.2 | wpb 10165.6 | bsz 973.9 | num_updates 10416 | best_bleu 17.397 (progress_bar.py:269, print())
[2021-03-22 15:03:36] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 12 @ 10416 updates, score 17.397037) (writing took 79.455603 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:05:16] INFO >> epoch 013: 384 / 868 loss=6.463, bleu=0, ppl=88.23, wps=988.8, ups=1.48, wpb=667.3, bsz=64, num_updates=10800, lr=8.9e-05, gnorm=15.515, clip=100, train_wall=95, wall=4208 (progress_bar.py:260, log())
[2021-03-22 15:06:53] INFO >> epoch 013: 784 / 868 loss=6.375, bleu=0, ppl=83, wps=2743.5, ups=4.12, wpb=665.3, bsz=64, num_updates=11200, lr=8.9e-05, gnorm=14.727, clip=100, train_wall=96, wall=4305 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:07:14] INFO >> epoch 013 | loss 6.405 | bleu 0 | ppl 84.77 | wps 1506.7 | ups 2.26 | wpb 666.6 | bsz 64 | num_updates 11284 | lr 8.9e-05 | gnorm 15.041 | clip 100 | train_wall 207 | wall 4327 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:08:42] INFO >> epoch 013 | valid on 'valid' subset | loss 6.88 | bleu 17.5086 | ppl 117.8 | wps 2400.3 | wpb 10165.6 | bsz 973.9 | num_updates 11284 | best_bleu 17.5086 (progress_bar.py:269, print())
[2021-03-22 15:09:34] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 13 @ 11284 updates, score 17.508591) (writing took 52.717968 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:10:57] INFO >> epoch 014: 316 / 868 loss=6.261, bleu=0, ppl=76.72, wps=1093.3, ups=1.64, wpb=667.8, bsz=64, num_updates=11600, lr=8.8e-05, gnorm=14.344, clip=100, train_wall=96, wall=4550 (progress_bar.py:260, log())
[2021-03-22 15:12:34] INFO >> epoch 014: 716 / 868 loss=6.198, bleu=0, ppl=73.41, wps=2740.6, ups=4.12, wpb=665.4, bsz=64, num_updates=12000, lr=8.8e-05, gnorm=13.807, clip=100, train_wall=96, wall=4647 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:13:12] INFO >> epoch 014 | loss 6.205 | bleu 0 | ppl 73.78 | wps 1618.9 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 12152 | lr 8.8e-05 | gnorm 13.961 | clip 100 | train_wall 207 | wall 4684 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:14:39] INFO >> epoch 014 | valid on 'valid' subset | loss 6.822 | bleu 17.6071 | ppl 113.15 | wps 2395.7 | wpb 10165.6 | bsz 973.9 | num_updates 12152 | best_bleu 17.6071 (progress_bar.py:269, print())
[2021-03-22 15:15:57] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 14 @ 12152 updates, score 17.607084) (writing took 78.293807 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:17:04] INFO >> epoch 015: 248 / 868 loss=6.1, bleu=0, ppl=68.6, wps=988.7, ups=1.48, wpb=668, bsz=64, num_updates=12400, lr=8.7e-05, gnorm=13.299, clip=100, train_wall=96, wall=4917 (progress_bar.py:260, log())
[2021-03-22 15:18:41] INFO >> epoch 015: 648 / 868 loss=6.016, bleu=0, ppl=64.7, wps=2758.8, ups=4.15, wpb=665.3, bsz=64, num_updates=12800, lr=8.7e-05, gnorm=13.01, clip=100, train_wall=95, wall=5013 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:19:35] INFO >> epoch 015 | loss 6.024 | bleu 0 | ppl 65.06 | wps 1508.1 | ups 2.26 | wpb 666.6 | bsz 64 | num_updates 13020 | lr 8.7e-05 | gnorm 12.951 | clip 100 | train_wall 207 | wall 5068 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:21:02] INFO >> epoch 015 | valid on 'valid' subset | loss 6.749 | bleu 18.4901 | ppl 107.54 | wps 2411.5 | wpb 10165.6 | bsz 973.9 | num_updates 13020 | best_bleu 18.4901 (progress_bar.py:269, print())
[2021-03-22 15:21:55] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 15 @ 13020 updates, score 18.490057) (writing took 52.906474 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:22:45] INFO >> epoch 016: 180 / 868 loss=5.966, bleu=0, ppl=62.52, wps=1092, ups=1.64, wpb=667.4, bsz=64, num_updates=13200, lr=8.6e-05, gnorm=12.484, clip=100, train_wall=96, wall=5258 (progress_bar.py:260, log())
[2021-03-22 15:24:22] INFO >> epoch 016: 580 / 868 loss=5.838, bleu=0, ppl=57.2, wps=2759.1, ups=4.14, wpb=665.8, bsz=64, num_updates=13600, lr=8.6e-05, gnorm=12.226, clip=100, train_wall=95, wall=5354 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:25:33] INFO >> epoch 016 | loss 5.855 | bleu 0 | ppl 57.9 | wps 1618.1 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 13888 | lr 8.6e-05 | gnorm 12.116 | clip 100 | train_wall 207 | wall 5426 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:26:59] INFO >> epoch 016 | valid on 'valid' subset | loss 6.69 | bleu 18.4697 | ppl 103.28 | wps 2430.5 | wpb 10165.6 | bsz 973.9 | num_updates 13888 | best_bleu 18.4901 (progress_bar.py:269, print())
[2021-03-22 15:27:42] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 16 @ 13888 updates, score 18.469717) (writing took 42.928469 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:28:16] INFO >> epoch 017: 112 / 868 loss=5.812, bleu=0, ppl=56.18, wps=1139, ups=1.71, wpb=666.9, bsz=64, num_updates=14000, lr=8.5e-05, gnorm=11.77, clip=100, train_wall=96, wall=5589 (progress_bar.py:260, log())
[2021-03-22 15:29:52] INFO >> epoch 017: 512 / 868 loss=5.698, bleu=0, ppl=51.92, wps=2787.1, ups=4.18, wpb=666.5, bsz=64, num_updates=14400, lr=8.5e-05, gnorm=11.522, clip=100, train_wall=94, wall=5684 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:31:20] INFO >> epoch 017 | loss 5.696 | bleu 0 | ppl 51.85 | wps 1668.7 | ups 2.5 | wpb 666.6 | bsz 64 | num_updates 14756 | lr 8.5e-05 | gnorm 11.349 | clip 100 | train_wall 207 | wall 5772 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:32:47] INFO >> epoch 017 | valid on 'valid' subset | loss 6.64 | bleu 18.4262 | ppl 99.76 | wps 2393.7 | wpb 10165.6 | bsz 973.9 | num_updates 14756 | best_bleu 18.4901 (progress_bar.py:269, print())
[2021-03-22 15:33:15] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 17 @ 14756 updates, score 18.426177) (writing took 28.034896 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:33:32] INFO >> epoch 018: 44 / 868 loss=5.656, bleu=0, ppl=50.41, wps=1208.8, ups=1.81, wpb=666.9, bsz=64, num_updates=14800, lr=8.4e-05, gnorm=11.076, clip=100, train_wall=97, wall=5905 (progress_bar.py:260, log())
[2021-03-22 15:35:09] INFO >> epoch 018: 444 / 868 loss=5.58, bleu=0, ppl=47.83, wps=2776.7, ups=4.16, wpb=667.7, bsz=64, num_updates=15200, lr=8.4e-05, gnorm=10.793, clip=100, train_wall=95, wall=6001 (progress_bar.py:260, log())
[2021-03-22 15:36:47] INFO >> epoch 018: 844 / 868 loss=5.519, bleu=0, ppl=45.85, wps=2715.8, ups=4.08, wpb=665.6, bsz=64, num_updates=15600, lr=8.4e-05, gnorm=10.563, clip=100, train_wall=97, wall=6099 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:36:53] INFO >> epoch 018 | loss 5.551 | bleu 0 | ppl 46.89 | wps 1734.5 | ups 2.6 | wpb 666.6 | bsz 64 | num_updates 15624 | lr 8.4e-05 | gnorm 10.682 | clip 100 | train_wall 208 | wall 6106 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:38:21] INFO >> epoch 018 | valid on 'valid' subset | loss 6.569 | bleu 18.9159 | ppl 94.95 | wps 2389.9 | wpb 10165.6 | bsz 973.9 | num_updates 15624 | best_bleu 18.9159 (progress_bar.py:269, print())
[2021-03-22 15:39:40] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 18 @ 15624 updates, score 18.915896) (writing took 79.352904 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:41:18] INFO >> epoch 019: 376 / 868 loss=5.443, bleu=0, ppl=43.52, wps=984.2, ups=1.48, wpb=667.1, bsz=64, num_updates=16000, lr=8.3e-05, gnorm=10.175, clip=100, train_wall=96, wall=6370 (progress_bar.py:260, log())
[2021-03-22 15:42:55] INFO >> epoch 019: 776 / 868 loss=5.402, bleu=0, ppl=42.29, wps=2744.5, ups=4.12, wpb=665.9, bsz=64, num_updates=16400, lr=8.3e-05, gnorm=9.933, clip=100, train_wall=96, wall=6467 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:43:18] INFO >> epoch 019 | loss 5.412 | bleu 0 | ppl 42.57 | wps 1503.9 | ups 2.26 | wpb 666.6 | bsz 64 | num_updates 16492 | lr 8.3e-05 | gnorm 10.022 | clip 100 | train_wall 208 | wall 6491 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:44:46] INFO >> epoch 019 | valid on 'valid' subset | loss 6.553 | bleu 19.0338 | ppl 93.91 | wps 2383.1 | wpb 10165.6 | bsz 973.9 | num_updates 16492 | best_bleu 19.0338 (progress_bar.py:269, print())
[2021-03-22 15:45:58] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 19 @ 16492 updates, score 19.033831) (writing took 72.423300 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:47:19] INFO >> epoch 020: 308 / 868 loss=5.309, bleu=0, ppl=39.65, wps=1010.9, ups=1.52, wpb=667, bsz=64, num_updates=16800, lr=8.3e-05, gnorm=9.639, clip=100, train_wall=95, wall=6731 (progress_bar.py:260, log())
[2021-03-22 15:48:56] INFO >> epoch 020: 708 / 868 loss=5.276, bleu=0, ppl=38.74, wps=2748.1, ups=4.13, wpb=665.9, bsz=64, num_updates=17200, lr=8.3e-05, gnorm=9.468, clip=100, train_wall=95, wall=6828 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:49:36] INFO >> epoch 020 | loss 5.278 | bleu 0 | ppl 38.81 | wps 1532.8 | ups 2.3 | wpb 666.6 | bsz 64 | num_updates 17360 | lr 8.3e-05 | gnorm 9.485 | clip 100 | train_wall 207 | wall 6868 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:51:03] INFO >> epoch 020 | valid on 'valid' subset | loss 6.532 | bleu 19.4468 | ppl 92.54 | wps 2400.4 | wpb 10165.6 | bsz 973.9 | num_updates 17360 | best_bleu 19.4468 (progress_bar.py:269, print())
[2021-03-22 15:52:09] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 20 @ 17360 updates, score 19.446791) (writing took 66.070458 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:53:13] INFO >> epoch 021: 240 / 868 loss=5.206, bleu=0, ppl=36.91, wps=1037.6, ups=1.55, wpb=667.8, bsz=64, num_updates=17600, lr=8.2e-05, gnorm=9.194, clip=100, train_wall=96, wall=7086 (progress_bar.py:260, log())
[2021-03-22 15:54:50] INFO >> epoch 021: 640 / 868 loss=5.131, bleu=0, ppl=35.03, wps=2757.3, ups=4.15, wpb=665, bsz=64, num_updates=18000, lr=8.2e-05, gnorm=9.04, clip=100, train_wall=95, wall=7182 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:55:46] INFO >> epoch 021 | loss 5.141 | bleu 0 | ppl 35.28 | wps 1560.1 | ups 2.34 | wpb 666.6 | bsz 64 | num_updates 18228 | lr 8.2e-05 | gnorm 9.02 | clip 100 | train_wall 208 | wall 7239 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 15:57:14] INFO >> epoch 021 | valid on 'valid' subset | loss 6.537 | bleu 19.4865 | ppl 92.87 | wps 2401.8 | wpb 10165.6 | bsz 973.9 | num_updates 18228 | best_bleu 19.4865 (progress_bar.py:269, print())
[2021-03-22 15:58:29] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 21 @ 18228 updates, score 19.48645) (writing took 75.709923 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 15:59:17] INFO >> epoch 022: 172 / 868 loss=5.101, bleu=0, ppl=34.32, wps=998.2, ups=1.5, wpb=667.5, bsz=64, num_updates=18400, lr=8.1e-05, gnorm=8.804, clip=100, train_wall=96, wall=7450 (progress_bar.py:260, log())
[2021-03-22 16:00:54] INFO >> epoch 022: 572 / 868 loss=4.995, bleu=0, ppl=31.89, wps=2761.3, ups=4.15, wpb=666, bsz=64, num_updates=18800, lr=8.1e-05, gnorm=8.623, clip=100, train_wall=95, wall=7546 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:02:07] INFO >> epoch 022 | loss 5.014 | bleu 0 | ppl 32.3 | wps 1520.9 | ups 2.28 | wpb 666.6 | bsz 64 | num_updates 19096 | lr 8.1e-05 | gnorm 8.602 | clip 100 | train_wall 208 | wall 7619 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:03:34] INFO >> epoch 022 | valid on 'valid' subset | loss 6.502 | bleu 20.0127 | ppl 90.64 | wps 2408.6 | wpb 10165.6 | bsz 973.9 | num_updates 19096 | best_bleu 20.0127 (progress_bar.py:269, print())
[2021-03-22 16:04:27] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 22 @ 19096 updates, score 20.012729) (writing took 52.581295 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:04:58] INFO >> epoch 023: 104 / 868 loss=4.995, bleu=0, ppl=31.9, wps=1090.5, ups=1.64, wpb=666.9, bsz=64, num_updates=19200, lr=8e-05, gnorm=8.525, clip=100, train_wall=97, wall=7791 (progress_bar.py:260, log())
[2021-03-22 16:06:34] INFO >> epoch 023: 504 / 868 loss=4.891, bleu=0, ppl=29.68, wps=2775.7, ups=4.16, wpb=666.7, bsz=64, num_updates=19600, lr=8e-05, gnorm=8.339, clip=100, train_wall=95, wall=7887 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:08:04] INFO >> epoch 023 | loss 4.893 | bleu 0 | ppl 29.72 | wps 1618.1 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 19964 | lr 8e-05 | gnorm 8.329 | clip 100 | train_wall 208 | wall 7977 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:09:32] INFO >> epoch 023 | valid on 'valid' subset | loss 6.549 | bleu 20.3849 | ppl 93.64 | wps 2407.9 | wpb 10165.6 | bsz 973.9 | num_updates 19964 | best_bleu 20.3849 (progress_bar.py:269, print())
[2021-03-22 16:10:46] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 23 @ 19964 updates, score 20.384935) (writing took 74.303226 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:11:01] INFO >> epoch 024: 36 / 868 loss=4.864, bleu=0, ppl=29.11, wps=997.8, ups=1.5, wpb=666, bsz=64, num_updates=20000, lr=7.9e-05, gnorm=8.249, clip=100, train_wall=97, wall=8154 (progress_bar.py:260, log())
[2021-03-22 16:12:37] INFO >> epoch 024: 436 / 868 loss=4.797, bleu=0, ppl=27.79, wps=2781.1, ups=4.16, wpb=668.2, bsz=64, num_updates=20400, lr=7.9e-05, gnorm=8.079, clip=100, train_wall=95, wall=8250 (progress_bar.py:260, log())
[2021-03-22 16:14:15] INFO >> epoch 024: 836 / 868 loss=4.749, bleu=0, ppl=26.89, wps=2723.9, ups=4.09, wpb=666.2, bsz=64, num_updates=20800, lr=7.9e-05, gnorm=7.965, clip=100, train_wall=96, wall=8348 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:14:24] INFO >> epoch 024 | loss 4.772 | bleu 0 | ppl 27.32 | wps 1525.6 | ups 2.29 | wpb 666.6 | bsz 64 | num_updates 20832 | lr 7.9e-05 | gnorm 8.022 | clip 100 | train_wall 208 | wall 8356 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:15:51] INFO >> epoch 024 | valid on 'valid' subset | loss 6.477 | bleu 20.4026 | ppl 89.08 | wps 2404.5 | wpb 10165.6 | bsz 973.9 | num_updates 20832 | best_bleu 20.4026 (progress_bar.py:269, print())
[2021-03-22 16:16:58] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 24 @ 20832 updates, score 20.402612) (writing took 67.081212 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:18:33] INFO >> epoch 025: 368 / 868 loss=4.679, bleu=0, ppl=25.61, wps=1033.8, ups=1.55, wpb=666.8, bsz=64, num_updates=21200, lr=7.9e-05, gnorm=7.818, clip=100, train_wall=95, wall=8606 (progress_bar.py:260, log())
[2021-03-22 16:20:10] INFO >> epoch 025: 768 / 868 loss=4.654, bleu=0, ppl=25.18, wps=2749.8, ups=4.13, wpb=666, bsz=64, num_updates=21600, lr=7.9e-05, gnorm=7.718, clip=100, train_wall=95, wall=8703 (progress_bar.py:260, log())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:20:35] INFO >> epoch 025 | loss 4.657 | bleu 0 | ppl 25.22 | wps 1557.9 | ups 2.34 | wpb 666.6 | bsz 64 | num_updates 21700 | lr 7.9e-05 | gnorm 7.76 | clip 100 | train_wall 207 | wall 8728 (progress_bar.py:269, print())
Using backend: pytorch
Using backend: pytorch
Using backend: pytorch
[2021-03-22 16:22:02] INFO >> epoch 025 | valid on 'valid' subset | loss 6.49 | bleu 20.737 | ppl 89.87 | wps 2384.2 | wpb 10165.6 | bsz 973.9 | num_updates 21700 | best_bleu 20.737 (progress_bar.py:269, print())
[2021-03-22 16:22:55] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 25 @ 21700 updates, score 20.73696) (writing took 52.805019 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:24:15] INFO >> epoch 026: 300 / 868 loss=4.566, bleu=0, ppl=23.69, wps=1090.4, ups=1.64, wpb=666.9, bsz=64, num_updates=22000, lr=7.8e-05, gnorm=7.646, clip=100, train_wall=96, wall=8947 (progress_bar.py:260, log())
[2021-03-22 16:25:52] INFO >> epoch 026: 700 / 868 loss=4.547, bleu=0, ppl=23.37, wps=2739.6, ups=4.12, wpb=665.5, bsz=64, num_updates=22400, lr=7.8e-05, gnorm=7.589, clip=100, train_wall=96, wall=9044 (progress_bar.py:260, log())
[2021-03-22 16:26:34] INFO >> epoch 026 | loss 4.544 | bleu 0 | ppl 23.33 | wps 1612.9 | ups 2.42 | wpb 666.6 | bsz 64 | num_updates 22568 | lr 7.8e-05 | gnorm 7.577 | clip 100 | train_wall 208 | wall 9086 (progress_bar.py:269, print())
[2021-03-22 16:28:01] INFO >> epoch 026 | valid on 'valid' subset | loss 6.499 | bleu 21.1331 | ppl 90.45 | wps 2412.1 | wpb 10165.6 | bsz 973.9 | num_updates 22568 | best_bleu 21.1331 (progress_bar.py:269, print())
[2021-03-22 16:29:21] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 26 @ 22568 updates, score 21.133116) (writing took 79.777086 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:30:24] INFO >> epoch 027: 232 / 868 loss=4.489, bleu=0, ppl=22.45, wps=984.3, ups=1.47, wpb=668.6, bsz=64, num_updates=22800, lr=7.7e-05, gnorm=7.404, clip=100, train_wall=96, wall=9316 (progress_bar.py:260, log())
[2021-03-22 16:32:00] INFO >> epoch 027: 632 / 868 loss=4.431, bleu=0, ppl=21.57, wps=2755.7, ups=4.15, wpb=664.6, bsz=64, num_updates=23200, lr=7.7e-05, gnorm=7.443, clip=100, train_wall=95, wall=9413 (progress_bar.py:260, log())
[2021-03-22 16:32:59] INFO >> epoch 027 | loss 4.436 | bleu 0 | ppl 21.65 | wps 1502.2 | ups 2.25 | wpb 666.6 | bsz 64 | num_updates 23436 | lr 7.7e-05 | gnorm 7.385 | clip 100 | train_wall 208 | wall 9472 (progress_bar.py:269, print())
[2021-03-22 16:34:28] INFO >> epoch 027 | valid on 'valid' subset | loss 6.517 | bleu 21.2966 | ppl 91.61 | wps 2370.9 | wpb 10165.6 | bsz 973.9 | num_updates 23436 | best_bleu 21.2966 (progress_bar.py:269, print())
[2021-03-22 16:35:46] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 27 @ 23436 updates, score 21.29659) (writing took 78.410628 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:36:32] INFO >> epoch 028: 164 / 868 loss=4.402, bleu=0, ppl=21.14, wps=980.8, ups=1.47, wpb=667.8, bsz=64, num_updates=23600, lr=7.6e-05, gnorm=7.31, clip=100, train_wall=97, wall=9685 (progress_bar.py:260, log())
[2021-03-22 16:38:09] INFO >> epoch 028: 564 / 868 loss=4.312, bleu=0, ppl=19.86, wps=2750.2, ups=4.13, wpb=665.5, bsz=64, num_updates=24000, lr=7.6e-05, gnorm=7.271, clip=100, train_wall=95, wall=9782 (progress_bar.py:260, log())
[2021-03-22 16:39:24] INFO >> epoch 028 | loss 4.33 | bleu 0 | ppl 20.12 | wps 1501.8 | ups 2.25 | wpb 666.6 | bsz 64 | num_updates 24304 | lr 7.6e-05 | gnorm 7.26 | clip 100 | train_wall 208 | wall 9857 (progress_bar.py:269, print())
[2021-03-22 16:40:52] INFO >> epoch 028 | valid on 'valid' subset | loss 6.593 | bleu 21.5289 | ppl 96.55 | wps 2389.2 | wpb 10165.6 | bsz 973.9 | num_updates 24304 | best_bleu 21.5289 (progress_bar.py:269, print())
[2021-03-22 16:41:45] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 28 @ 24304 updates, score 21.528895) (writing took 52.572291 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:42:14] INFO >> epoch 029: 96 / 868 loss=4.323, bleu=0, ppl=20.02, wps=1088.3, ups=1.63, wpb=667.1, bsz=64, num_updates=24400, lr=7.5e-05, gnorm=7.203, clip=100, train_wall=96, wall=10027 (progress_bar.py:260, log())
[2021-03-22 16:43:50] INFO >> epoch 029: 496 / 868 loss=4.218, bleu=0, ppl=18.61, wps=2783.8, ups=4.17, wpb=667.4, bsz=64, num_updates=24800, lr=7.5e-05, gnorm=7.169, clip=100, train_wall=94, wall=10123 (progress_bar.py:260, log())
[2021-03-22 16:45:23] INFO >> epoch 029 | loss 4.224 | bleu 0 | ppl 18.69 | wps 1614.9 | ups 2.42 | wpb 666.6 | bsz 64 | num_updates 25172 | lr 7.5e-05 | gnorm 7.146 | clip 100 | train_wall 208 | wall 10215 (progress_bar.py:269, print())
[2021-03-22 16:46:51] INFO >> epoch 029 | valid on 'valid' subset | loss 6.547 | bleu 21.5051 | ppl 93.5 | wps 2355.6 | wpb 10165.6 | bsz 973.9 | num_updates 25172 | best_bleu 21.5289 (progress_bar.py:269, print())
[2021-03-22 16:47:40] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 29 @ 25172 updates, score 21.505074) (writing took 49.100115 seconds) (checkpoint_utils.py:79, save_checkpoint())
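The pattern above suggests save_checkpoint writes checkpoint_best.pt whenever the validation BLEU beats the running best_bleu, and only checkpoint_last.pt otherwise (epoch 29: 21.5051 < best 21.5289). A minimal sketch of that selection logic, with illustrative names rather than the actual checkpoint_utils.py code:

    class BestCheckpointTracker:
        # Sketch of the best-vs-last behaviour visible in this log, not the real API.
        def __init__(self):
            self.best_bleu = float("-inf")

        def target(self, bleu):
            if bleu > self.best_bleu:        # e.g. epoch 28: 21.5289 > 21.2966 -> best
                self.best_bleu = bleu
                return "checkpoint_best.pt"
            return "checkpoint_last.pt"      # e.g. epoch 29: 21.5051 < 21.5289 -> last only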
[2021-03-22 16:47:54] INFO >> epoch 030: 28 / 868 loss=4.198, bleu=0, ppl=18.36, wps=1093, ups=1.64, wpb=665.2, bsz=64, num_updates=25200, lr=7.5e-05, gnorm=7.122, clip=100, train_wall=97, wall=10366 (progress_bar.py:260, log())
[2021-03-22 16:49:30] INFO >> epoch 030: 428 / 868 loss=4.141, bleu=0, ppl=17.65, wps=2781.2, ups=4.16, wpb=668.3, bsz=64, num_updates=25600, lr=7.5e-05, gnorm=7.044, clip=100, train_wall=95, wall=10462 (progress_bar.py:260, log())
[2021-03-22 16:51:07] INFO >> epoch 030: 828 / 868 loss=4.114, bleu=0, ppl=17.32, wps=2731.1, ups=4.1, wpb=665.8, bsz=64, num_updates=26000, lr=7.5e-05, gnorm=7.024, clip=100, train_wall=96, wall=10560 (progress_bar.py:260, log())
[2021-03-22 16:51:18] INFO >> epoch 030 | loss 4.122 | bleu 0 | ppl 17.42 | wps 1628.8 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 26040 | lr 7.5e-05 | gnorm 7.033 | clip 100 | train_wall 207 | wall 10570 (progress_bar.py:269, print())
[2021-03-22 16:52:44] INFO >> epoch 030 | valid on 'valid' subset | loss 6.561 | bleu 21.9499 | ppl 94.42 | wps 2416.1 | wpb 10165.6 | bsz 973.9 | num_updates 26040 | best_bleu 21.9499 (progress_bar.py:269, print())
[2021-03-22 16:53:50] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 30 @ 26040 updates, score 21.949926) (writing took 65.333668 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 16:55:22] INFO >> epoch 031: 360 / 868 loss=4.039, bleu=0, ppl=16.44, wps=1047.1, ups=1.57, wpb=666.8, bsz=64, num_updates=26400, lr=7.4e-05, gnorm=6.971, clip=100, train_wall=94, wall=10815 (progress_bar.py:260, log())
[2021-03-22 16:56:59] INFO >> epoch 031: 760 / 868 loss=4.029, bleu=0, ppl=16.32, wps=2747.5, ups=4.12, wpb=666.4, bsz=64, num_updates=26800, lr=7.4e-05, gnorm=6.929, clip=100, train_wall=96, wall=10912 (progress_bar.py:260, log())
[2021-03-22 16:57:26] INFO >> epoch 031 | loss 4.025 | bleu 0 | ppl 16.28 | wps 1570.7 | ups 2.36 | wpb 666.6 | bsz 64 | num_updates 26908 | lr 7.4e-05 | gnorm 6.943 | clip 100 | train_wall 206 | wall 10939 (progress_bar.py:269, print())
[2021-03-22 16:58:53] INFO >> epoch 031 | valid on 'valid' subset | loss 6.602 | bleu 22.2201 | ppl 97.13 | wps 2427.7 | wpb 10165.6 | bsz 973.9 | num_updates 26908 | best_bleu 22.2201 (progress_bar.py:269, print())
[2021-03-22 16:59:49] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 31 @ 26908 updates, score 22.220125) (writing took 56.610767 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:01:06] INFO >> epoch 032: 292 / 868 loss=3.944, bleu=0, ppl=15.39, wps=1079.8, ups=1.62, wpb=667, bsz=64, num_updates=27200, lr=7.3e-05, gnorm=6.903, clip=100, train_wall=95, wall=11159 (progress_bar.py:260, log())
[2021-03-22 17:02:43] INFO >> epoch 032: 692 / 868 loss=3.917, bleu=0, ppl=15.11, wps=2743.4, ups=4.12, wpb=665.2, bsz=64, num_updates=27600, lr=7.3e-05, gnorm=6.873, clip=100, train_wall=95, wall=11256 (progress_bar.py:260, log())
[2021-03-22 17:03:27] INFO >> epoch 032 | loss 3.922 | bleu 0 | ppl 15.16 | wps 1604 | ups 2.41 | wpb 666.6 | bsz 64 | num_updates 27776 | lr 7.3e-05 | gnorm 6.874 | clip 100 | train_wall 207 | wall 11300 (progress_bar.py:269, print())
[2021-03-22 17:04:54] INFO >> epoch 032 | valid on 'valid' subset | loss 6.595 | bleu 22.3027 | ppl 96.67 | wps 2396.7 | wpb 10165.6 | bsz 973.9 | num_updates 27776 | best_bleu 22.3027 (progress_bar.py:269, print())
[2021-03-22 17:05:47] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 32 @ 27776 updates, score 22.302691) (writing took 52.845703 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:06:48] INFO >> epoch 033: 224 / 868 loss=3.882, bleu=0, ppl=14.74, wps=1092.6, ups=1.63, wpb=668.5, bsz=64, num_updates=28000, lr=7.2e-05, gnorm=6.77, clip=100, train_wall=96, wall=11501 (progress_bar.py:260, log())
[2021-03-22 17:08:24] INFO >> epoch 033: 624 / 868 loss=3.825, bleu=0, ppl=14.18, wps=2768.5, ups=4.16, wpb=664.8, bsz=64, num_updates=28400, lr=7.2e-05, gnorm=6.782, clip=100, train_wall=95, wall=11597 (progress_bar.py:260, log())
[2021-03-22 17:09:25] INFO >> epoch 033 | loss 3.832 | bleu 0 | ppl 14.25 | wps 1617.9 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 28644 | lr 7.2e-05 | gnorm 6.779 | clip 100 | train_wall 207 | wall 11657 (progress_bar.py:269, print())
[2021-03-22 17:10:52] INFO >> epoch 033 | valid on 'valid' subset | loss 6.71 | bleu 22.2734 | ppl 104.67 | wps 2390.7 | wpb 10165.6 | bsz 973.9 | num_updates 28644 | best_bleu 22.3027 (progress_bar.py:269, print())
[2021-03-22 17:11:20] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 33 @ 28644 updates, score 22.273432) (writing took 27.903230 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:12:04] INFO >> epoch 034: 156 / 868 loss=3.81, bleu=0, ppl=14.02, wps=1213, ups=1.82, wpb=668.1, bsz=64, num_updates=28800, lr=7.2e-05, gnorm=6.807, clip=100, train_wall=96, wall=11817 (progress_bar.py:260, log())
[2021-03-22 17:13:41] INFO >> epoch 034: 556 / 868 loss=3.719, bleu=0, ppl=13.17, wps=2757.7, ups=4.14, wpb=665.4, bsz=64, num_updates=29200, lr=7.2e-05, gnorm=6.753, clip=100, train_wall=95, wall=11913 (progress_bar.py:260, log())
[2021-03-22 17:14:58] INFO >> epoch 034 | loss 3.743 | bleu 0 | ppl 13.39 | wps 1734.7 | ups 2.6 | wpb 666.6 | bsz 64 | num_updates 29512 | lr 7.2e-05 | gnorm 6.766 | clip 100 | train_wall 208 | wall 11991 (progress_bar.py:269, print())
[2021-03-22 17:16:25] INFO >> epoch 034 | valid on 'valid' subset | loss 6.666 | bleu 22.7828 | ppl 101.57 | wps 2415 | wpb 10165.6 | bsz 973.9 | num_updates 29512 | best_bleu 22.7828 (progress_bar.py:269, print())
[2021-03-22 17:17:18] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 34 @ 29512 updates, score 22.782847) (writing took 52.970474 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:17:46] INFO >> epoch 035: 88 / 868 loss=3.736, bleu=0, ppl=13.32, wps=1087.2, ups=1.63, wpb=666.7, bsz=64, num_updates=29600, lr=7.1e-05, gnorm=6.761, clip=100, train_wall=97, wall=12159 (progress_bar.py:260, log())
[2021-03-22 17:19:22] INFO >> epoch 035: 488 / 868 loss=3.646, bleu=0, ppl=12.52, wps=2789.1, ups=4.17, wpb=668.1, bsz=64, num_updates=30000, lr=7.1e-05, gnorm=6.716, clip=100, train_wall=94, wall=12255 (progress_bar.py:260, log())
[2021-03-22 17:20:56] INFO >> epoch 035 | loss 3.652 | bleu 0 | ppl 12.57 | wps 1617.6 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 30380 | lr 7.1e-05 | gnorm 6.71 | clip 100 | train_wall 208 | wall 12348 (progress_bar.py:269, print())
[2021-03-22 17:22:23] INFO >> epoch 035 | valid on 'valid' subset | loss 6.713 | bleu 22.6521 | ppl 104.93 | wps 2408.2 | wpb 10165.6 | bsz 973.9 | num_updates 30380 | best_bleu 22.7828 (progress_bar.py:269, print())
[2021-03-22 17:22:50] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 35 @ 30380 updates, score 22.652127) (writing took 27.420224 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:23:02] INFO >> epoch 036: 20 / 868 loss=3.635, bleu=0, ppl=12.43, wps=1209.1, ups=1.82, wpb=664.8, bsz=64, num_updates=30400, lr=7e-05, gnorm=6.704, clip=100, train_wall=97, wall=12474 (progress_bar.py:260, log())
[2021-03-22 17:24:38] INFO >> epoch 036: 420 / 868 loss=3.581, bleu=0, ppl=11.97, wps=2776.4, ups=4.15, wpb=668.4, bsz=64, num_updates=30800, lr=7e-05, gnorm=6.664, clip=100, train_wall=95, wall=12571 (progress_bar.py:260, log())
[2021-03-22 17:26:15] INFO >> epoch 036: 820 / 868 loss=3.561, bleu=0, ppl=11.8, wps=2740.7, ups=4.12, wpb=666, bsz=64, num_updates=31200, lr=7e-05, gnorm=6.673, clip=100, train_wall=96, wall=12668 (progress_bar.py:260, log())
[2021-03-22 17:26:28] INFO >> epoch 036 | loss 3.561 | bleu 0 | ppl 11.8 | wps 1742 | ups 2.61 | wpb 666.6 | bsz 64 | num_updates 31248 | lr 7e-05 | gnorm 6.685 | clip 100 | train_wall 207 | wall 12681 (progress_bar.py:269, print())
[2021-03-22 17:27:55] INFO >> epoch 036 | valid on 'valid' subset | loss 6.659 | bleu 23.0079 | ppl 101.04 | wps 2415.4 | wpb 10165.6 | bsz 973.9 | num_updates 31248 | best_bleu 23.0079 (progress_bar.py:269, print())
[2021-03-22 17:28:48] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 36 @ 31248 updates, score 23.007909) (writing took 52.921332 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:30:19] INFO >> epoch 037: 352 / 868 loss=3.477, bleu=0, ppl=11.14, wps=1093.3, ups=1.64, wpb=666.7, bsz=64, num_updates=31600, lr=7e-05, gnorm=6.712, clip=100, train_wall=95, wall=12912 (progress_bar.py:260, log())
[2021-03-22 17:31:56] INFO >> epoch 037: 752 / 868 loss=3.488, bleu=0, ppl=11.22, wps=2763.5, ups=4.15, wpb=665.8, bsz=64, num_updates=32000, lr=7e-05, gnorm=6.645, clip=100, train_wall=95, wall=13008 (progress_bar.py:260, log())
[2021-03-22 17:32:25] INFO >> epoch 037 | loss 3.479 | bleu 0 | ppl 11.15 | wps 1620.4 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 32116 | lr 7e-05 | gnorm 6.648 | clip 100 | train_wall 207 | wall 13038 (progress_bar.py:269, print())
[2021-03-22 17:33:51] INFO >> epoch 037 | valid on 'valid' subset | loss 6.626 | bleu 23.0702 | ppl 98.74 | wps 2427.5 | wpb 10165.6 | bsz 973.9 | num_updates 32116 | best_bleu 23.0702 (progress_bar.py:269, print())
[2021-03-22 17:34:44] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 37 @ 32116 updates, score 23.070207) (writing took 52.429675 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:35:59] INFO >> epoch 038: 284 / 868 loss=3.413, bleu=0, ppl=10.65, wps=1094.1, ups=1.64, wpb=667, bsz=64, num_updates=32400, lr=6.9e-05, gnorm=6.59, clip=100, train_wall=96, wall=13252 (progress_bar.py:260, log())
[2021-03-22 17:37:36] INFO >> epoch 038: 684 / 868 loss=3.392, bleu=0, ppl=10.5, wps=2746.6, ups=4.13, wpb=665.7, bsz=64, num_updates=32800, lr=6.9e-05, gnorm=6.649, clip=100, train_wall=95, wall=13349 (progress_bar.py:260, log())
[2021-03-22 17:38:22] INFO >> epoch 038 | loss 3.392 | bleu 0 | ppl 10.5 | wps 1620.4 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 32984 | lr 6.9e-05 | gnorm 6.604 | clip 100 | train_wall 208 | wall 13395 (progress_bar.py:269, print())
[2021-03-22 17:39:48] INFO >> epoch 038 | valid on 'valid' subset | loss 6.716 | bleu 23.2151 | ppl 105.16 | wps 2431.5 | wpb 10165.6 | bsz 973.9 | num_updates 32984 | best_bleu 23.2151 (progress_bar.py:269, print())
[2021-03-22 17:40:41] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 38 @ 32984 updates, score 23.215106) (writing took 52.648426 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:41:40] INFO >> epoch 039: 216 / 868 loss=3.344, bleu=0, ppl=10.15, wps=1096.3, ups=1.64, wpb=668.3, bsz=64, num_updates=33200, lr=6.8e-05, gnorm=6.519, clip=100, train_wall=96, wall=13593 (progress_bar.py:260, log())
[2021-03-22 17:43:17] INFO >> epoch 039: 616 / 868 loss=3.306, bleu=0, ppl=9.89, wps=2760.2, ups=4.15, wpb=665.1, bsz=64, num_updates=33600, lr=6.8e-05, gnorm=6.63, clip=100, train_wall=95, wall=13689 (progress_bar.py:260, log())
[2021-03-22 17:44:19] INFO >> epoch 039 | loss 3.309 | bleu 0 | ppl 9.91 | wps 1620.3 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 33852 | lr 6.8e-05 | gnorm 6.59 | clip 100 | train_wall 208 | wall 13752 (progress_bar.py:269, print())
[2021-03-22 17:45:46] INFO >> epoch 039 | valid on 'valid' subset | loss 6.745 | bleu 23.197 | ppl 107.29 | wps 2412.4 | wpb 10165.6 | bsz 973.9 | num_updates 33852 | best_bleu 23.2151 (progress_bar.py:269, print())
[2021-03-22 17:46:14] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 39 @ 33852 updates, score 23.197005) (writing took 27.528289 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:46:57] INFO >> epoch 040: 148 / 868 loss=3.29, bleu=0, ppl=9.78, wps=1214.5, ups=1.82, wpb=668.2, bsz=64, num_updates=34000, lr=6.8e-05, gnorm=6.556, clip=100, train_wall=97, wall=13909 (progress_bar.py:260, log())
[2021-03-22 17:48:33] INFO >> epoch 040: 548 / 868 loss=3.212, bleu=0, ppl=9.26, wps=2753.6, ups=4.14, wpb=665, bsz=64, num_updates=34400, lr=6.8e-05, gnorm=6.576, clip=100, train_wall=95, wall=14006 (progress_bar.py:260, log())
[2021-03-22 17:49:53] INFO >> epoch 040 | loss 3.225 | bleu 0 | ppl 9.35 | wps 1734.8 | ups 2.6 | wpb 666.6 | bsz 64 | num_updates 34720 | lr 6.8e-05 | gnorm 6.56 | clip 100 | train_wall 208 | wall 14085 (progress_bar.py:269, print())
[2021-03-22 17:51:19] INFO >> epoch 040 | valid on 'valid' subset | loss 6.781 | bleu 23.6943 | ppl 110 | wps 2442.2 | wpb 10165.6 | bsz 973.9 | num_updates 34720 | best_bleu 23.6943 (progress_bar.py:269, print())
[2021-03-22 17:52:11] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 40 @ 34720 updates, score 23.694322) (writing took 52.713078 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:52:37] INFO >> epoch 041: 80 / 868 loss=3.22, bleu=0, ppl=9.32, wps=1093, ups=1.64, wpb=666.6, bsz=64, num_updates=34800, lr=6.7e-05, gnorm=6.567, clip=100, train_wall=97, wall=14250 (progress_bar.py:260, log())
[2021-03-22 17:54:14] INFO >> epoch 041: 480 / 868 loss=3.147, bleu=0, ppl=8.86, wps=2772.6, ups=4.15, wpb=667.9, bsz=64, num_updates=35200, lr=6.7e-05, gnorm=6.535, clip=100, train_wall=95, wall=14346 (progress_bar.py:260, log())
[2021-03-22 17:55:50] INFO >> epoch 041 | loss 3.152 | bleu 0 | ppl 8.89 | wps 1620.2 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 35588 | lr 6.7e-05 | gnorm 6.535 | clip 100 | train_wall 208 | wall 14443 (progress_bar.py:269, print())
[2021-03-22 17:57:17] INFO >> epoch 041 | valid on 'valid' subset | loss 6.727 | bleu 23.5184 | ppl 105.93 | wps 2425 | wpb 10165.6 | bsz 973.9 | num_updates 35588 | best_bleu 23.6943 (progress_bar.py:269, print())
[2021-03-22 17:57:44] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 41 @ 35588 updates, score 23.518381) (writing took 26.828031 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 17:57:53] INFO >> epoch 042: 12 / 868 loss=3.142, bleu=0, ppl=8.82, wps=1210.5, ups=1.82, wpb=665.3, bsz=64, num_updates=35600, lr=6.6e-05, gnorm=6.523, clip=100, train_wall=97, wall=14566 (progress_bar.py:260, log())
[2021-03-22 17:59:30] INFO >> epoch 042: 412 / 868 loss=3.096, bleu=0, ppl=8.55, wps=2775.5, ups=4.15, wpb=668.1, bsz=64, num_updates=36000, lr=6.6e-05, gnorm=6.552, clip=100, train_wall=95, wall=14662 (progress_bar.py:260, log())
[2021-03-22 18:01:07] INFO >> epoch 042: 812 / 868 loss=3.075, bleu=0, ppl=8.43, wps=2733, ups=4.1, wpb=666.2, bsz=64, num_updates=36400, lr=6.6e-05, gnorm=6.54, clip=100, train_wall=96, wall=14760 (progress_bar.py:260, log())
[2021-03-22 18:01:22] INFO >> epoch 042 | loss 3.075 | bleu 0 | ppl 8.43 | wps 1742.3 | ups 2.61 | wpb 666.6 | bsz 64 | num_updates 36456 | lr 6.6e-05 | gnorm 6.551 | clip 100 | train_wall 208 | wall 14775 (progress_bar.py:269, print())
[2021-03-22 18:02:50] INFO >> epoch 042 | valid on 'valid' subset | loss 6.827 | bleu 23.4429 | ppl 113.53 | wps 2402 | wpb 10165.6 | bsz 973.9 | num_updates 36456 | best_bleu 23.6943 (progress_bar.py:269, print())
[2021-03-22 18:03:17] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 42 @ 36456 updates, score 23.442892) (writing took 27.408127 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:04:47] INFO >> epoch 043: 344 / 868 loss=2.995, bleu=0, ppl=7.97, wps=1215.3, ups=1.82, wpb=666.8, bsz=64, num_updates=36800, lr=6.6e-05, gnorm=6.56, clip=100, train_wall=96, wall=14979 (progress_bar.py:260, log())
[2021-03-22 18:06:24] INFO >> epoch 043: 744 / 868 loss=3.004, bleu=0, ppl=8.02, wps=2736.8, ups=4.11, wpb=665.3, bsz=64, num_updates=37200, lr=6.6e-05, gnorm=6.59, clip=100, train_wall=96, wall=15077 (progress_bar.py:260, log())
[2021-03-22 18:06:55] INFO >> epoch 043 | loss 2.997 | bleu 0 | ppl 7.98 | wps 1735.5 | ups 2.6 | wpb 666.6 | bsz 64 | num_updates 37324 | lr 6.6e-05 | gnorm 6.559 | clip 100 | train_wall 208 | wall 15108 (progress_bar.py:269, print())
[2021-03-22 18:08:23] INFO >> epoch 043 | valid on 'valid' subset | loss 6.882 | bleu 23.6709 | ppl 117.94 | wps 2405.2 | wpb 10165.6 | bsz 973.9 | num_updates 37324 | best_bleu 23.6943 (progress_bar.py:269, print())
[2021-03-22 18:08:51] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 43 @ 37324 updates, score 23.670899) (writing took 27.690498 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:10:04] INFO >> epoch 044: 276 / 868 loss=2.942, bleu=0, ppl=7.69, wps=1212.1, ups=1.81, wpb=667.9, bsz=64, num_updates=37600, lr=6.5e-05, gnorm=6.482, clip=100, train_wall=96, wall=15297 (progress_bar.py:260, log())
[2021-03-22 18:11:41] INFO >> epoch 044: 676 / 868 loss=2.936, bleu=0, ppl=7.65, wps=2749.6, ups=4.13, wpb=665.6, bsz=64, num_updates=38000, lr=6.5e-05, gnorm=6.552, clip=100, train_wall=95, wall=15394 (progress_bar.py:260, log())
[2021-03-22 18:12:30] INFO >> epoch 044 | loss 2.929 | bleu 0 | ppl 7.61 | wps 1732 | ups 2.6 | wpb 666.6 | bsz 64 | num_updates 38192 | lr 6.5e-05 | gnorm 6.524 | clip 100 | train_wall 208 | wall 15442 (progress_bar.py:269, print())
[2021-03-22 18:13:57] INFO >> epoch 044 | valid on 'valid' subset | loss 6.892 | bleu 23.7384 | ppl 118.75 | wps 2396.3 | wpb 10165.6 | bsz 973.9 | num_updates 38192 | best_bleu 23.7384 (progress_bar.py:269, print())
[2021-03-22 18:14:49] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 44 @ 38192 updates, score 23.738404) (writing took 51.968711 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:15:46] INFO >> epoch 045: 208 / 868 loss=2.886, bleu=0, ppl=7.39, wps=1090.6, ups=1.63, wpb=667.8, bsz=64, num_updates=38400, lr=6.4e-05, gnorm=6.509, clip=100, train_wall=97, wall=15639 (progress_bar.py:260, log())
[2021-03-22 18:17:22] INFO >> epoch 045: 608 / 868 loss=2.857, bleu=0, ppl=7.25, wps=2762.4, ups=4.15, wpb=665.2, bsz=64, num_updates=38800, lr=6.4e-05, gnorm=6.522, clip=100, train_wall=95, wall=15735 (progress_bar.py:260, log())
[2021-03-22 18:18:27] INFO >> epoch 045 | loss 2.859 | bleu 0 | ppl 7.26 | wps 1618.6 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 39060 | lr 6.4e-05 | gnorm 6.507 | clip 100 | train_wall 208 | wall 15800 (progress_bar.py:269, print())
[2021-03-22 18:19:55] INFO >> epoch 045 | valid on 'valid' subset | loss 6.966 | bleu 23.9414 | ppl 125.06 | wps 2377.6 | wpb 10165.6 | bsz 973.9 | num_updates 39060 | best_bleu 23.9414 (progress_bar.py:269, print())
[2021-03-22 18:20:49] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 45 @ 39060 updates, score 23.941433) (writing took 53.590834 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:21:29] INFO >> epoch 046: 140 / 868 loss=2.844, bleu=0, ppl=7.18, wps=1083.6, ups=1.62, wpb=668.4, bsz=64, num_updates=39200, lr=6.4e-05, gnorm=6.494, clip=100, train_wall=96, wall=15982 (progress_bar.py:260, log())
[2021-03-22 18:23:06] INFO >> epoch 046: 540 / 868 loss=2.779, bleu=0, ppl=6.87, wps=2756.5, ups=4.14, wpb=665.5, bsz=64, num_updates=39600, lr=6.4e-05, gnorm=6.492, clip=100, train_wall=95, wall=16078 (progress_bar.py:260, log())
[2021-03-22 18:24:27] INFO >> epoch 046 | loss 2.788 | bleu 0 | ppl 6.91 | wps 1606.8 | ups 2.41 | wpb 666.6 | bsz 64 | num_updates 39928 | lr 6.4e-05 | gnorm 6.496 | clip 100 | train_wall 208 | wall 16160 (progress_bar.py:269, print())
[2021-03-22 18:25:56] INFO >> epoch 046 | valid on 'valid' subset | loss 7.08 | bleu 23.7269 | ppl 135.31 | wps 2359.3 | wpb 10165.6 | bsz 973.9 | num_updates 39928 | best_bleu 23.9414 (progress_bar.py:269, print())
[2021-03-22 18:26:24] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 46 @ 39928 updates, score 23.726856) (writing took 27.272066 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:26:48] INFO >> epoch 047: 72 / 868 loss=2.777, bleu=0, ppl=6.85, wps=1200.2, ups=1.8, wpb=666.1, bsz=64, num_updates=40000, lr=6.3e-05, gnorm=6.523, clip=100, train_wall=97, wall=16300 (progress_bar.py:260, log())
[2021-03-22 18:28:24] INFO >> epoch 047: 472 / 868 loss=2.713, bleu=0, ppl=6.56, wps=2773.7, ups=4.15, wpb=667.9, bsz=64, num_updates=40400, lr=6.3e-05, gnorm=6.517, clip=100, train_wall=95, wall=16397 (progress_bar.py:260, log())
[2021-03-22 18:30:02] INFO >> epoch 047 | loss 2.722 | bleu 0 | ppl 6.6 | wps 1728.6 | ups 2.59 | wpb 666.6 | bsz 64 | num_updates 40796 | lr 6.3e-05 | gnorm 6.521 | clip 100 | train_wall 208 | wall 16494 (progress_bar.py:269, print())
[2021-03-22 18:31:30] INFO >> epoch 047 | valid on 'valid' subset | loss 7.146 | bleu 24.1224 | ppl 141.61 | wps 2390.5 | wpb 10165.6 | bsz 973.9 | num_updates 40796 | best_bleu 24.1224 (progress_bar.py:269, print())
[2021-03-22 18:32:29] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 47 @ 40796 updates, score 24.122396) (writing took 59.344024 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:32:37] INFO >> epoch 048: 4 / 868 loss=2.726, bleu=0, ppl=6.61, wps=1052.6, ups=1.58, wpb=665.6, bsz=64, num_updates=40800, lr=6.2e-05, gnorm=6.511, clip=100, train_wall=97, wall=16650 (progress_bar.py:260, log())
[2021-03-22 18:34:13] INFO >> epoch 048: 404 / 868 loss=2.667, bleu=0, ppl=6.35, wps=2771.4, ups=4.15, wpb=667.3, bsz=64, num_updates=41200, lr=6.2e-05, gnorm=6.489, clip=100, train_wall=95, wall=16746 (progress_bar.py:260, log())
[2021-03-22 18:35:51] INFO >> epoch 048: 804 / 868 loss=2.667, bleu=0, ppl=6.35, wps=2740, ups=4.11, wpb=666.9, bsz=64, num_updates=41600, lr=6.2e-05, gnorm=6.53, clip=100, train_wall=96, wall=16843 (progress_bar.py:260, log())
[2021-03-22 18:36:07] INFO >> epoch 048 | loss 2.657 | bleu 0 | ppl 6.31 | wps 1583.9 | ups 2.38 | wpb 666.6 | bsz 64 | num_updates 41664 | lr 6.2e-05 | gnorm 6.513 | clip 100 | train_wall 207 | wall 16860 (progress_bar.py:269, print())
[2021-03-22 18:37:36] INFO >> epoch 048 | valid on 'valid' subset | loss 7.237 | bleu 24.2701 | ppl 150.82 | wps 2347.2 | wpb 10165.6 | bsz 973.9 | num_updates 41664 | best_bleu 24.2701 (progress_bar.py:269, print())
[2021-03-22 18:38:29] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 48 @ 41664 updates, score 24.270143) (writing took 52.758319 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:39:57] INFO >> epoch 049: 336 / 868 loss=2.586, bleu=0, ppl=6, wps=1083.6, ups=1.63, wpb=666.6, bsz=64, num_updates=42000, lr=6.2e-05, gnorm=6.488, clip=100, train_wall=96, wall=17089 (progress_bar.py:260, log())
[2021-03-22 18:41:34] INFO >> epoch 049: 736 / 868 loss=2.601, bleu=0, ppl=6.07, wps=2733, ups=4.11, wpb=665.3, bsz=64, num_updates=42400, lr=6.2e-05, gnorm=6.525, clip=100, train_wall=96, wall=17187 (progress_bar.py:260, log())
[2021-03-22 18:42:07] INFO >> epoch 049 | loss 2.592 | bleu 0 | ppl 6.03 | wps 1606.5 | ups 2.41 | wpb 666.6 | bsz 64 | num_updates 42532 | lr 6.2e-05 | gnorm 6.501 | clip 100 | train_wall 208 | wall 17220 (progress_bar.py:269, print())
[2021-03-22 18:43:36] INFO >> epoch 049 | valid on 'valid' subset | loss 7.246 | bleu 24.3414 | ppl 151.84 | wps 2350.5 | wpb 10165.6 | bsz 973.9 | num_updates 42532 | best_bleu 24.3414 (progress_bar.py:269, print())
[2021-03-22 18:44:29] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 49 @ 42532 updates, score 24.341437) (writing took 52.549246 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:45:40] INFO >> epoch 050: 268 / 868 loss=2.539, bleu=0, ppl=5.81, wps=1086.8, ups=1.63, wpb=667.6, bsz=64, num_updates=42800, lr=6.1e-05, gnorm=6.485, clip=100, train_wall=95, wall=17432 (progress_bar.py:260, log())
[2021-03-22 18:47:16] INFO >> epoch 050: 668 / 868 loss=2.534, bleu=0, ppl=5.79, wps=2760.6, ups=4.15, wpb=665.7, bsz=64, num_updates=43200, lr=6.1e-05, gnorm=6.547, clip=100, train_wall=95, wall=17529 (progress_bar.py:260, log())
[2021-03-22 18:48:06] INFO >> epoch 050 | loss 2.527 | bleu 0 | ppl 5.76 | wps 1611 | ups 2.42 | wpb 666.6 | bsz 64 | num_updates 43400 | lr 6.1e-05 | gnorm 6.51 | clip 100 | train_wall 207 | wall 17579 (progress_bar.py:269, print())
[2021-03-22 18:49:36] INFO >> epoch 050 | valid on 'valid' subset | loss 7.36 | bleu 24.7153 | ppl 164.24 | wps 2364.2 | wpb 10165.6 | bsz 973.9 | num_updates 43400 | best_bleu 24.7153 (progress_bar.py:269, print())
[2021-03-22 18:51:10] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 50 @ 43400 updates, score 24.71527) (writing took 93.991779 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:52:05] INFO >> epoch 051: 200 / 868 loss=2.49, bleu=0, ppl=5.62, wps=924.2, ups=1.38, wpb=667.7, bsz=64, num_updates=43600, lr=6.1e-05, gnorm=6.466, clip=100, train_wall=96, wall=17818 (progress_bar.py:260, log())
[2021-03-22 18:53:42] INFO >> epoch 051: 600 / 868 loss=2.464, bleu=0, ppl=5.52, wps=2754.1, ups=4.14, wpb=665.5, bsz=64, num_updates=44000, lr=6.1e-05, gnorm=6.53, clip=100, train_wall=95, wall=17915 (progress_bar.py:260, log())
[2021-03-22 18:54:49] INFO >> epoch 051 | loss 2.464 | bleu 0 | ppl 5.52 | wps 1439.1 | ups 2.16 | wpb 666.6 | bsz 64 | num_updates 44268 | lr 6.1e-05 | gnorm 6.502 | clip 100 | train_wall 208 | wall 17981 (progress_bar.py:269, print())
[2021-03-22 18:56:18] INFO >> epoch 051 | valid on 'valid' subset | loss 7.442 | bleu 24.491 | ppl 173.89 | wps 2342.2 | wpb 10165.6 | bsz 973.9 | num_updates 44268 | best_bleu 24.7153 (progress_bar.py:269, print())
[2021-03-22 18:56:50] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 51 @ 44268 updates, score 24.490963) (writing took 31.711436 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 18:57:29] INFO >> epoch 052: 132 / 868 loss=2.461, bleu=0, ppl=5.51, wps=1178.2, ups=1.76, wpb=668.2, bsz=64, num_updates=44400, lr=6e-05, gnorm=6.478, clip=100, train_wall=97, wall=18141 (progress_bar.py:260, log())
[2021-03-22 18:59:06] INFO >> epoch 052: 532 / 868 loss=2.394, bleu=0, ppl=5.26, wps=2747.2, ups=4.13, wpb=665.3, bsz=64, num_updates=44800, lr=6e-05, gnorm=6.512, clip=100, train_wall=95, wall=18238 (progress_bar.py:260, log())
[2021-03-22 19:00:29] INFO >> epoch 052 | loss 2.407 | bleu 0 | ppl 5.31 | wps 1700.9 | ups 2.55 | wpb 666.6 | bsz 64 | num_updates 45136 | lr 6e-05 | gnorm 6.481 | clip 100 | train_wall 208 | wall 18321 (progress_bar.py:269, print())
[2021-03-22 19:01:58] INFO >> epoch 052 | valid on 'valid' subset | loss 7.477 | bleu 25.0289 | ppl 178.16 | wps 2378.9 | wpb 10165.6 | bsz 973.9 | num_updates 45136 | best_bleu 25.0289 (progress_bar.py:269, print())
[2021-03-22 19:02:57] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 52 @ 45136 updates, score 25.028893) (writing took 59.173600 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:03:19] INFO >> epoch 053: 64 / 868 loss=2.394, bleu=0, ppl=5.26, wps=1051.6, ups=1.58, wpb=666.2, bsz=64, num_updates=45200, lr=5.9e-05, gnorm=6.466, clip=100, train_wall=97, wall=18492 (progress_bar.py:260, log())
[2021-03-22 19:04:55] INFO >> epoch 053: 464 / 868 loss=2.345, bleu=0, ppl=5.08, wps=2779, ups=4.16, wpb=667.8, bsz=64, num_updates=45600, lr=5.9e-05, gnorm=6.481, clip=100, train_wall=95, wall=18588 (progress_bar.py:260, log())
[2021-03-22 19:06:34] INFO >> epoch 053: 864 / 868 loss=2.351, bleu=0, ppl=5.1, wps=2705, ups=4.06, wpb=665.9, bsz=64, num_updates=46000, lr=5.9e-05, gnorm=6.52, clip=100, train_wall=97, wall=18686 (progress_bar.py:260, log())
[2021-03-22 19:06:35] INFO >> epoch 053 | loss 2.349 | bleu 0 | ppl 5.1 | wps 1579.2 | ups 2.37 | wpb 666.6 | bsz 64 | num_updates 46004 | lr 5.9e-05 | gnorm 6.502 | clip 100 | train_wall 208 | wall 18688 (progress_bar.py:269, print())
[2021-03-22 19:08:01] INFO >> epoch 053 | valid on 'valid' subset | loss 7.47 | bleu 25.0355 | ppl 177.32 | wps 2426.1 | wpb 10165.6 | bsz 973.9 | num_updates 46004 | best_bleu 25.0355 (progress_bar.py:269, print())
[2021-03-22 19:08:54] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 53 @ 46004 updates, score 25.035472) (writing took 52.266318 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:10:36] INFO >> epoch 054: 396 / 868 loss=2.301, bleu=0, ppl=4.93, wps=1101.8, ups=1.65, wpb=667.2, bsz=64, num_updates=46400, lr=5.9e-05, gnorm=6.449, clip=100, train_wall=95, wall=18928 (progress_bar.py:260, log())
[2021-03-22 19:12:13] INFO >> epoch 054: 796 / 868 loss=2.293, bleu=0, ppl=4.9, wps=2733.6, ups=4.1, wpb=666.1, bsz=64, num_updates=46800, lr=5.9e-05, gnorm=6.537, clip=100, train_wall=96, wall=19026 (progress_bar.py:260, log())
[2021-03-22 19:12:32] INFO >> epoch 054 | loss 2.29 | bleu 0 | ppl 4.89 | wps 1623.6 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 46872 | lr 5.9e-05 | gnorm 6.491 | clip 100 | train_wall 207 | wall 19044 (progress_bar.py:269, print())
[2021-03-22 19:14:00] INFO >> epoch 054 | valid on 'valid' subset | loss 7.591 | bleu 25.298 | ppl 192.87 | wps 2410.6 | wpb 10165.6 | bsz 973.9 | num_updates 46872 | best_bleu 25.298 (progress_bar.py:269, print())
[2021-03-22 19:14:52] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 54 @ 46872 updates, score 25.297976) (writing took 51.990884 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:16:17] INFO >> epoch 055: 328 / 868 loss=2.235, bleu=0, ppl=4.71, wps=1093.4, ups=1.64, wpb=667.3, bsz=64, num_updates=47200, lr=5.8e-05, gnorm=6.452, clip=100, train_wall=95, wall=19270 (progress_bar.py:260, log())
[2021-03-22 19:17:55] INFO >> epoch 055: 728 / 868 loss=2.245, bleu=0, ppl=4.74, wps=2734.2, ups=4.11, wpb=665.4, bsz=64, num_updates=47600, lr=5.8e-05, gnorm=6.53, clip=100, train_wall=96, wall=19367 (progress_bar.py:260, log())
[2021-03-22 19:18:30] INFO >> epoch 055 | loss 2.236 | bleu 0 | ppl 4.71 | wps 1614.3 | ups 2.42 | wpb 666.6 | bsz 64 | num_updates 47740 | lr 5.8e-05 | gnorm 6.489 | clip 100 | train_wall 208 | wall 19403 (progress_bar.py:269, print())
[2021-03-22 19:19:58] INFO >> epoch 055 | valid on 'valid' subset | loss 7.537 | bleu 25.5673 | ppl 185.72 | wps 2398.5 | wpb 10165.6 | bsz 973.9 | num_updates 47740 | best_bleu 25.5673 (progress_bar.py:269, print())
[2021-03-22 19:20:50] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 55 @ 47740 updates, score 25.567275) (writing took 52.527813 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:22:00] INFO >> epoch 056: 260 / 868 loss=2.196, bleu=0, ppl=4.58, wps=1089.4, ups=1.63, wpb=667.9, bsz=64, num_updates=48000, lr=5.8e-05, gnorm=6.446, clip=100, train_wall=96, wall=19613 (progress_bar.py:260, log())
[2021-03-22 19:23:36] INFO >> epoch 056: 660 / 868 loss=2.187, bleu=0, ppl=4.55, wps=2770.6, ups=4.16, wpb=665.6, bsz=64, num_updates=48400, lr=5.8e-05, gnorm=6.483, clip=100, train_wall=95, wall=19709 (progress_bar.py:260, log())
[2021-03-22 19:24:28] INFO >> epoch 056 | loss 2.184 | bleu 0 | ppl 4.54 | wps 1616.4 | ups 2.42 | wpb 666.6 | bsz 64 | num_updates 48608 | lr 5.8e-05 | gnorm 6.469 | clip 100 | train_wall 207 | wall 19760 (progress_bar.py:269, print())
[2021-03-22 19:25:55] INFO >> epoch 056 | valid on 'valid' subset | loss 7.742 | bleu 25.7515 | ppl 214.01 | wps 2424 | wpb 10165.6 | bsz 973.9 | num_updates 48608 | best_bleu 25.7515 (progress_bar.py:269, print())
[2021-03-22 19:26:47] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 56 @ 48608 updates, score 25.751523) (writing took 52.540727 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:27:41] INFO >> epoch 057: 192 / 868 loss=2.159, bleu=0, ppl=4.47, wps=1091.5, ups=1.64, wpb=667.2, bsz=64, num_updates=48800, lr=5.7e-05, gnorm=6.454, clip=100, train_wall=96, wall=19953 (progress_bar.py:260, log())
[2021-03-22 19:29:17] INFO >> epoch 057: 592 / 868 loss=2.119, bleu=0, ppl=4.34, wps=2755.5, ups=4.14, wpb=665.5, bsz=64, num_updates=49200, lr=5.7e-05, gnorm=6.492, clip=100, train_wall=95, wall=20050 (progress_bar.py:260, log())
[2021-03-22 19:30:26] INFO >> epoch 057 | loss 2.126 | bleu 0 | ppl 4.37 | wps 1617.2 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 49476 | lr 5.7e-05 | gnorm 6.463 | clip 100 | train_wall 208 | wall 20118 (progress_bar.py:269, print())
[2021-03-22 19:31:51] INFO >> epoch 057 | valid on 'valid' subset | loss 7.749 | bleu 25.8345 | ppl 215.11 | wps 2466.4 | wpb 10165.6 | bsz 973.9 | num_updates 49476 | best_bleu 25.8345 (progress_bar.py:269, print())
[2021-03-22 19:32:44] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 57 @ 49476 updates, score 25.83446) (writing took 52.371716 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:33:20] INFO >> epoch 058: 124 / 868 loss=2.126, bleu=0, ppl=4.37, wps=1099, ups=1.64, wpb=668.1, bsz=64, num_updates=49600, lr=5.6e-05, gnorm=6.449, clip=100, train_wall=96, wall=20293 (progress_bar.py:260, log())
[2021-03-22 19:34:56] INFO >> epoch 058: 524 / 868 loss=2.068, bleu=0, ppl=4.19, wps=2780.3, ups=4.18, wpb=665.2, bsz=64, num_updates=50000, lr=5.6e-05, gnorm=6.478, clip=100, train_wall=94, wall=20389 (progress_bar.py:260, log())
[2021-03-22 19:36:21] INFO >> epoch 058 | loss 2.082 | bleu 0 | ppl 4.23 | wps 1626.4 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 50344 | lr 5.6e-05 | gnorm 6.465 | clip 100 | train_wall 207 | wall 20474 (progress_bar.py:269, print())
[2021-03-22 19:37:47] INFO >> epoch 058 | valid on 'valid' subset | loss 7.81 | bleu 25.9641 | ppl 224.35 | wps 2443.6 | wpb 10165.6 | bsz 973.9 | num_updates 50344 | best_bleu 25.9641 (progress_bar.py:269, print())
[2021-03-22 19:38:40] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 58 @ 50344 updates, score 25.964075) (writing took 52.659125 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:39:00] INFO >> epoch 059: 56 / 868 loss=2.072, bleu=0, ppl=4.21, wps=1092.9, ups=1.64, wpb=666.9, bsz=64, num_updates=50400, lr=5.6e-05, gnorm=6.452, clip=100, train_wall=97, wall=20633 (progress_bar.py:260, log())
[2021-03-22 19:40:36] INFO >> epoch 059: 456 / 868 loss=2.033, bleu=0, ppl=4.09, wps=2779.9, ups=4.16, wpb=668.1, bsz=64, num_updates=50800, lr=5.6e-05, gnorm=6.498, clip=100, train_wall=95, wall=20729 (progress_bar.py:260, log())
[2021-03-22 19:42:14] INFO >> epoch 059: 856 / 868 loss=2.035, bleu=0, ppl=4.1, wps=2716, ups=4.08, wpb=665.2, bsz=64, num_updates=51200, lr=5.6e-05, gnorm=6.485, clip=100, train_wall=96, wall=20827 (progress_bar.py:260, log())
[2021-03-22 19:42:18] INFO >> epoch 059 | loss 2.032 | bleu 0 | ppl 4.09 | wps 1622.9 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 51212 | lr 5.6e-05 | gnorm 6.488 | clip 100 | train_wall 208 | wall 20831 (progress_bar.py:269, print())
[2021-03-22 19:43:43] INFO >> epoch 059 | valid on 'valid' subset | loss 7.978 | bleu 26.1111 | ppl 252.21 | wps 2468.9 | wpb 10165.6 | bsz 973.9 | num_updates 51212 | best_bleu 26.1111 (progress_bar.py:269, print())
[2021-03-22 19:44:36] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 59 @ 51212 updates, score 26.111097) (writing took 52.796476 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:46:16] INFO >> epoch 060: 388 / 868 loss=1.992, bleu=0, ppl=3.98, wps=1103.3, ups=1.65, wpb=667.1, bsz=64, num_updates=51600, lr=5.5e-05, gnorm=6.459, clip=100, train_wall=95, wall=21069 (progress_bar.py:260, log())
[2021-03-22 19:47:53] INFO >> epoch 060: 788 / 868 loss=1.981, bleu=0, ppl=3.95, wps=2740, ups=4.11, wpb=665.9, bsz=64, num_updates=52000, lr=5.5e-05, gnorm=6.479, clip=100, train_wall=96, wall=21166 (progress_bar.py:260, log())
[2021-03-22 19:48:14] INFO >> epoch 060 | loss 1.984 | bleu 0 | ppl 3.96 | wps 1625 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 52080 | lr 5.5e-05 | gnorm 6.466 | clip 100 | train_wall 208 | wall 21187 (progress_bar.py:269, print())
[2021-03-22 19:49:39] INFO >> epoch 060 | valid on 'valid' subset | loss 8 | bleu 26.3204 | ppl 255.99 | wps 2483.1 | wpb 10165.6 | bsz 973.9 | num_updates 52080 | best_bleu 26.3204 (progress_bar.py:269, print())
[2021-03-22 19:50:31] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 60 @ 52080 updates, score 26.320367) (writing took 52.336812 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:51:55] INFO >> epoch 061: 320 / 868 loss=1.933, bleu=0, ppl=3.82, wps=1107.2, ups=1.66, wpb=667.6, bsz=64, num_updates=52400, lr=5.5e-05, gnorm=6.427, clip=100, train_wall=95, wall=21407 (progress_bar.py:260, log())
[2021-03-22 19:53:32] INFO >> epoch 061: 720 / 868 loss=1.937, bleu=0, ppl=3.83, wps=2736.8, ups=4.11, wpb=665.4, bsz=64, num_updates=52800, lr=5.5e-05, gnorm=6.469, clip=100, train_wall=96, wall=21504 (progress_bar.py:260, log())
[2021-03-22 19:54:09] INFO >> epoch 061 | loss 1.93 | bleu 0 | ppl 3.81 | wps 1631.7 | ups 2.45 | wpb 666.6 | bsz 64 | num_updates 52948 | lr 5.5e-05 | gnorm 6.45 | clip 100 | train_wall 207 | wall 21541 (progress_bar.py:269, print())
[2021-03-22 19:55:34] INFO >> epoch 061 | valid on 'valid' subset | loss 7.995 | bleu 26.1225 | ppl 255.13 | wps 2453.2 | wpb 10165.6 | bsz 973.9 | num_updates 52948 | best_bleu 26.3204 (progress_bar.py:269, print())
[2021-03-22 19:56:02] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 61 @ 52948 updates, score 26.122537) (writing took 27.402706 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 19:57:09] INFO >> epoch 062: 252 / 868 loss=1.902, bleu=0, ppl=3.74, wps=1227.6, ups=1.84, wpb=667.7, bsz=64, num_updates=53200, lr=5.4e-05, gnorm=6.431, clip=100, train_wall=96, wall=21722 (progress_bar.py:260, log())
[2021-03-22 19:58:46] INFO >> epoch 062: 652 / 868 loss=1.903, bleu=0, ppl=3.74, wps=2750.8, ups=4.13, wpb=665.8, bsz=64, num_updates=53600, lr=5.4e-05, gnorm=6.492, clip=100, train_wall=95, wall=21819 (progress_bar.py:260, log())
[2021-03-22 19:59:40] INFO >> epoch 062 | loss 1.893 | bleu 0 | ppl 3.71 | wps 1747.1 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 53816 | lr 5.4e-05 | gnorm 6.452 | clip 100 | train_wall 208 | wall 21872 (progress_bar.py:269, print())
[2021-03-22 20:01:05] INFO >> epoch 062 | valid on 'valid' subset | loss 8.215 | bleu 26.169 | ppl 297.22 | wps 2462.9 | wpb 10165.6 | bsz 973.9 | num_updates 53816 | best_bleu 26.3204 (progress_bar.py:269, print())
[2021-03-22 20:01:33] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 62 @ 53816 updates, score 26.169004) (writing took 27.622413 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:02:24] INFO >> epoch 063: 184 / 868 loss=1.871, bleu=0, ppl=3.66, wps=1223.5, ups=1.83, wpb=667.1, bsz=64, num_updates=54000, lr=5.4e-05, gnorm=6.42, clip=100, train_wall=96, wall=22037 (progress_bar.py:260, log())
[2021-03-22 20:04:01] INFO >> epoch 063: 584 / 868 loss=1.84, bleu=0, ppl=3.58, wps=2755.6, ups=4.14, wpb=665.7, bsz=64, num_updates=54400, lr=5.4e-05, gnorm=6.472, clip=100, train_wall=95, wall=22134 (progress_bar.py:260, log())
[2021-03-22 20:05:11] INFO >> epoch 063 | loss 1.848 | bleu 0 | ppl 3.6 | wps 1746 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 54684 | lr 5.4e-05 | gnorm 6.449 | clip 100 | train_wall 208 | wall 22204 (progress_bar.py:269, print())
[2021-03-22 20:06:37] INFO >> epoch 063 | valid on 'valid' subset | loss 8.038 | bleu 26.4962 | ppl 262.76 | wps 2459.2 | wpb 10165.6 | bsz 973.9 | num_updates 54684 | best_bleu 26.4962 (progress_bar.py:269, print())
[2021-03-22 20:07:29] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 63 @ 54684 updates, score 26.496199) (writing took 52.265704 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:08:04] INFO >> epoch 064: 116 / 868 loss=1.843, bleu=0, ppl=3.59, wps=1099.4, ups=1.65, wpb=667.2, bsz=64, num_updates=54800, lr=5.3e-05, gnorm=6.446, clip=100, train_wall=96, wall=22376 (progress_bar.py:260, log())
[2021-03-22 20:09:40] INFO >> epoch 064: 516 / 868 loss=1.796, bleu=0, ppl=3.47, wps=2777, ups=4.17, wpb=666.2, bsz=64, num_updates=55200, lr=5.3e-05, gnorm=6.48, clip=100, train_wall=95, wall=22472 (progress_bar.py:260, log())
[2021-03-22 20:11:07] INFO >> epoch 064 | loss 1.805 | bleu 0 | ppl 3.5 | wps 1626 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 55552 | lr 5.3e-05 | gnorm 6.466 | clip 100 | train_wall 208 | wall 22560 (progress_bar.py:269, print())
[2021-03-22 20:12:33] INFO >> epoch 064 | valid on 'valid' subset | loss 8.203 | bleu 26.5549 | ppl 294.76 | wps 2457.9 | wpb 10165.6 | bsz 973.9 | num_updates 55552 | best_bleu 26.5549 (progress_bar.py:269, print())
[2021-03-22 20:13:25] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 64 @ 55552 updates, score 26.55486) (writing took 52.533862 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:13:43] INFO >> epoch 065: 48 / 868 loss=1.797, bleu=0, ppl=3.47, wps=1094, ups=1.64, wpb=666.7, bsz=64, num_updates=55600, lr=5.3e-05, gnorm=6.445, clip=100, train_wall=97, wall=22716 (progress_bar.py:260, log())
[2021-03-22 20:15:20] INFO >> epoch 065: 448 / 868 loss=1.771, bleu=0, ppl=3.41, wps=2768.8, ups=4.15, wpb=668, bsz=64, num_updates=56000, lr=5.3e-05, gnorm=6.438, clip=100, train_wall=95, wall=22813 (progress_bar.py:260, log())
[2021-03-22 20:16:58] INFO >> epoch 065: 848 / 868 loss=1.759, bleu=0, ppl=3.38, wps=2717.6, ups=4.08, wpb=665.6, bsz=64, num_updates=56400, lr=5.3e-05, gnorm=6.451, clip=100, train_wall=96, wall=22911 (progress_bar.py:260, log())
[2021-03-22 20:17:04] INFO >> epoch 065 | loss 1.762 | bleu 0 | ppl 3.39 | wps 1623.2 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 56420 | lr 5.3e-05 | gnorm 6.44 | clip 100 | train_wall 208 | wall 22916 (progress_bar.py:269, print())
[2021-03-22 20:18:29] INFO >> epoch 065 | valid on 'valid' subset | loss 8.295 | bleu 26.3747 | ppl 314.16 | wps 2457.8 | wpb 10165.6 | bsz 973.9 | num_updates 56420 | best_bleu 26.5549 (progress_bar.py:269, print())
[2021-03-22 20:18:57] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 65 @ 56420 updates, score 26.374678) (writing took 27.449558 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:20:35] INFO >> epoch 066: 380 / 868 loss=1.729, bleu=0, ppl=3.31, wps=1228.2, ups=1.84, wpb=667.5, bsz=64, num_updates=56800, lr=5.2e-05, gnorm=6.417, clip=100, train_wall=95, wall=23128 (progress_bar.py:260, log())
[2021-03-22 20:22:13] INFO >> epoch 066: 780 / 868 loss=1.724, bleu=0, ppl=3.3, wps=2738.4, ups=4.11, wpb=665.5, bsz=64, num_updates=57200, lr=5.2e-05, gnorm=6.427, clip=100, train_wall=96, wall=23225 (progress_bar.py:260, log())
[2021-03-22 20:22:35] INFO >> epoch 066 | loss 1.723 | bleu 0 | ppl 3.3 | wps 1746.7 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 57288 | lr 5.2e-05 | gnorm 6.428 | clip 100 | train_wall 208 | wall 23247 (progress_bar.py:269, print())
[2021-03-22 20:24:00] INFO >> epoch 066 | valid on 'valid' subset | loss 8.363 | bleu 26.7818 | ppl 329.22 | wps 2468.5 | wpb 10165.6 | bsz 973.9 | num_updates 57288 | best_bleu 26.7818 (progress_bar.py:269, print())
[2021-03-22 20:24:53] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 66 @ 57288 updates, score 26.781823) (writing took 52.775477 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:26:15] INFO >> epoch 067: 312 / 868 loss=1.684, bleu=0, ppl=3.21, wps=1102.3, ups=1.65, wpb=667.4, bsz=64, num_updates=57600, lr=5.2e-05, gnorm=6.417, clip=100, train_wall=96, wall=23467 (progress_bar.py:260, log())
[2021-03-22 20:27:52] INFO >> epoch 067: 712 / 868 loss=1.693, bleu=0, ppl=3.23, wps=2733.7, ups=4.11, wpb=665.3, bsz=64, num_updates=58000, lr=5.2e-05, gnorm=6.476, clip=100, train_wall=96, wall=23565 (progress_bar.py:260, log())
[2021-03-22 20:28:31] INFO >> epoch 067 | loss 1.687 | bleu 0 | ppl 3.22 | wps 1625.8 | ups 2.44 | wpb 666.6 | bsz 64 | num_updates 58156 | lr 5.2e-05 | gnorm 6.438 | clip 100 | train_wall 208 | wall 23603 (progress_bar.py:269, print())
[2021-03-22 20:29:56] INFO >> epoch 067 | valid on 'valid' subset | loss 8.358 | bleu 26.7503 | ppl 328.13 | wps 2456.6 | wpb 10165.6 | bsz 973.9 | num_updates 58156 | best_bleu 26.7818 (progress_bar.py:269, print())
[2021-03-22 20:30:24] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_last.pt (epoch 67 @ 58156 updates, score 26.750337) (writing took 27.475486 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:31:30] INFO >> epoch 068: 244 / 868 loss=1.659, bleu=0, ppl=3.16, wps=1228.5, ups=1.84, wpb=668, bsz=64, num_updates=58400, lr=5.1e-05, gnorm=6.371, clip=100, train_wall=96, wall=23782 (progress_bar.py:260, log())
[2021-03-22 20:33:06] INFO >> epoch 068: 644 / 868 loss=1.653, bleu=0, ppl=3.15, wps=2751.1, ups=4.13, wpb=665.4, bsz=64, num_updates=58800, lr=5.1e-05, gnorm=6.445, clip=100, train_wall=95, wall=23879 (progress_bar.py:260, log())
[2021-03-22 20:34:02] INFO >> epoch 068 | loss 1.649 | bleu 0 | ppl 3.14 | wps 1747.2 | ups 2.62 | wpb 666.6 | bsz 64 | num_updates 59024 | lr 5.1e-05 | gnorm 6.405 | clip 100 | train_wall 208 | wall 23935 (progress_bar.py:269, print())
[2021-03-22 20:35:27] INFO >> epoch 068 | valid on 'valid' subset | loss 8.425 | bleu 26.9712 | ppl 343.66 | wps 2448.3 | wpb 10165.6 | bsz 973.9 | num_updates 59024 | best_bleu 26.9712 (progress_bar.py:269, print())
[2021-03-22 20:36:20] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 68 @ 59024 updates, score 26.971163) (writing took 52.758600 seconds) (checkpoint_utils.py:79, save_checkpoint())
[2021-03-22 20:37:09] INFO >> epoch 069: 176 / 868 loss=1.637, bleu=0, ppl=3.11, wps=1098.7, ups=1.65, wpb=667.6, bsz=64, num_updates=59200, lr=5e-05, gnorm=6.396, clip=100, train_wall=96, wall=24122 (progress_bar.py:260, log())
[2021-03-22 20:38:46] INFO >> epoch 069: 576 / 868 loss=1.596, bleu=0, ppl=3.02, wps=2756.3, ups=4.14, wpb=665.6, bsz=64, num_updates=59600, lr=5e-05, gnorm=6.422, clip=100, train_wall=95, wall=24219 (progress_bar.py:260, log())
[2021-03-22 20:39:59] INFO >> epoch 069 | loss 1.607 | bleu 0 | ppl 3.05 | wps 1622.7 | ups 2.43 | wpb 666.6 | bsz 64 | num_updates 59892 | lr 5e-05 | gnorm 6.407 | clip 100 | train_wall 208 | wall 24291 (progress_bar.py:269, print())
[2021-03-22 20:41:24] INFO >> epoch 069 | valid on 'valid' subset | loss 8.509 | bleu 26.9954 | ppl 364.28 | wps 2451.7 | wpb 10165.6 | bsz 973.9 | num_updates 59892 | best_bleu 26.9954 (progress_bar.py:269, print())
[2021-03-22 20:42:17] INFO >> saved checkpoint /mnt/wanyao/.ncc/python_wan/summarization/data-mmap/transformer/checkpoints/checkpoint_best.pt (epoch 69 @ 59892 updates, score 26.995372) (writing took 52.637872 seconds) (checkpoint_utils.py:79, save_checkpoint())
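The validation summaries above follow a fixed format, so per-epoch BLEU can be pulled out of this file directly; a short sketch, assuming the log is saved as python.log:

    import re

    # Matches the "| valid on 'valid' subset | ... | bleu <x> |" summary lines in this log.
    pattern = re.compile(r"epoch (\d+) \| valid on 'valid' subset .*?\| bleu ([\d.]+)")

    with open("python.log") as f:
        for line in f:
            m = pattern.search(line)
            if m:
                print(int(m.group(1)), float(m.group(2)))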
[2021-03-22 20:42:50] INFO >> epoch 070: 108 / 868 loss=1.606, bleu=0, ppl=3.04, wps=1095, ups=1.64, wpb=667, bsz=64, num_updates=60000, lr=5e-05, gnorm=6.401, clip=100, train_wall=97, wall=24462 (progress_bar.py:260, log())
[2021-03-22 20:44:26] INFO >> epoch 070: 508 / 868 loss=1.557, bleu=0, ppl=2.94, wps=2774.6, ups=4.16, wpb=666.4, bsz=64, num_updates=60400, lr=5e-05, gnorm=6.357, clip=100, train_wall=95, wall=24558 (progress_bar.py:260, log())