training.out
2024-04-17 13:49:08.637839: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-04-17 13:49:08.637949: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-04-17 13:49:08.639067: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-04-17 13:49:08.645047: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-17 13:49:09.759214: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-17 13:49:10.793319: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:10.828183: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:10.828453: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:12.923910: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:12.924177: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:12.924372: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:13.005885: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:13.006107: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:13.006299: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-04-17 13:49:13.006457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5484 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1
Num GPUs Available: 1
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 conv2d (Conv2D)             (None, 32, 32, 16)        1216
 batch_normalization (Batch  (None, 32, 32, 16)        64
 Normalization)
 activation (Activation)     (None, 32, 32, 16)        0
 max_pooling2d (MaxPooling2  (None, 16, 16, 16)        0
 D)
 linear_bottleneck_block (L  (None, 16, 16, 32)        1344
 inearBottleneckBlock)
 linear_bottleneck_block_1   (None, 16, 16, 64)        3712
 (LinearBottleneckBlock)
 max_pooling2d_1 (MaxPoolin  (None, 8, 8, 64)          0
 g2D)
 linear_bottleneck_block_2   (None, 8, 8, 64)          6208
 (LinearBottleneckBlock)
 linear_bottleneck_block_3   (None, 8, 8, 128)         11520
 (LinearBottleneckBlock)
 global_average_pooling2d (  (None, 128)               0
 GlobalAveragePooling2D)
 dropout (Dropout)           (None, 128)               0
 dense (Dense)               (None, 10)                1290
=================================================================
Total params: 25354 (99.04 KB)
Trainable params: 23818 (93.04 KB)
Non-trainable params: 1536 (6.00 KB)
_________________________________________________________________
2024-04-17 13:49:14.034388: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 614400000 exceeds 10% of free system memory.
2024-04-17 13:49:14.472942: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 614400000 exceeds 10% of free system memory.
Epoch 1/100
2024-04-17 13:49:18.034108: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8904
2024-04-17 13:49:19.247980: I external/local_xla/xla/service/service.cc:168] XLA service 0x7f05e777b640 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-04-17 13:49:19.248018: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce GTX 1060 6GB, Compute Capability 6.1
2024-04-17 13:49:19.254041: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1713358159.348367 282002 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
1563/1563 - 37s - loss: 1.4910 - accuracy: 0.4697 - val_loss: 1.7780 - val_accuracy: 0.4840 - lr: 0.0100 - 37s/epoch - 24ms/step
Epoch 2/100
1563/1563 - 33s - loss: 1.2131 - accuracy: 0.5811 - val_loss: 1.2344 - val_accuracy: 0.5918 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 3/100
1563/1563 - 32s - loss: 1.1127 - accuracy: 0.6210 - val_loss: 1.7672 - val_accuracy: 0.4822 - lr: 0.0100 - 32s/epoch - 21ms/step
Epoch 4/100
1563/1563 - 33s - loss: 1.0438 - accuracy: 0.6466 - val_loss: 1.2509 - val_accuracy: 0.6006 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 5/100
1563/1563 - 33s - loss: 0.9853 - accuracy: 0.6709 - val_loss: 1.0542 - val_accuracy: 0.6473 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 6/100
1563/1563 - 31s - loss: 0.9441 - accuracy: 0.6865 - val_loss: 1.0716 - val_accuracy: 0.6510 - lr: 0.0100 - 31s/epoch - 20ms/step
Epoch 7/100
1563/1563 - 35s - loss: 0.9122 - accuracy: 0.6973 - val_loss: 1.1258 - val_accuracy: 0.6206 - lr: 0.0100 - 35s/epoch - 23ms/step
Epoch 8/100
1563/1563 - 31s - loss: 0.8943 - accuracy: 0.7054 - val_loss: 0.9702 - val_accuracy: 0.6698 - lr: 0.0100 - 31s/epoch - 20ms/step
Epoch 9/100
1563/1563 - 34s - loss: 0.8746 - accuracy: 0.7129 - val_loss: 0.8716 - val_accuracy: 0.7123 - lr: 0.0100 - 34s/epoch - 22ms/step
Epoch 10/100
1563/1563 - 34s - loss: 0.8519 - accuracy: 0.7188 - val_loss: 0.9478 - val_accuracy: 0.6938 - lr: 0.0100 - 34s/epoch - 22ms/step
Epoch 11/100
1563/1563 - 33s - loss: 0.8316 - accuracy: 0.7280 - val_loss: 0.8544 - val_accuracy: 0.7303 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 12/100
1563/1563 - 28s - loss: 0.8224 - accuracy: 0.7306 - val_loss: 0.9477 - val_accuracy: 0.6885 - lr: 0.0100 - 28s/epoch - 18ms/step
Epoch 13/100
1563/1563 - 30s - loss: 0.8057 - accuracy: 0.7356 - val_loss: 1.1028 - val_accuracy: 0.6514 - lr: 0.0100 - 30s/epoch - 20ms/step
Epoch 14/100
1563/1563 - 32s - loss: 0.7994 - accuracy: 0.7386 - val_loss: 0.8261 - val_accuracy: 0.7288 - lr: 0.0100 - 32s/epoch - 20ms/step
Epoch 15/100
1563/1563 - 31s - loss: 0.7870 - accuracy: 0.7445 - val_loss: 0.7969 - val_accuracy: 0.7393 - lr: 0.0100 - 31s/epoch - 20ms/step
Epoch 16/100
1563/1563 - 32s - loss: 0.7791 - accuracy: 0.7459 - val_loss: 0.7753 - val_accuracy: 0.7517 - lr: 0.0100 - 32s/epoch - 21ms/step
Epoch 17/100
1563/1563 - 33s - loss: 0.7639 - accuracy: 0.7503 - val_loss: 0.8532 - val_accuracy: 0.7211 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 18/100
1563/1563 - 32s - loss: 0.7581 - accuracy: 0.7520 - val_loss: 0.7961 - val_accuracy: 0.7451 - lr: 0.0100 - 32s/epoch - 20ms/step
Epoch 19/100
1563/1563 - 31s - loss: 0.7489 - accuracy: 0.7597 - val_loss: 0.8495 - val_accuracy: 0.7228 - lr: 0.0100 - 31s/epoch - 20ms/step
Epoch 20/100
1563/1563 - 31s - loss: 0.7414 - accuracy: 0.7608 - val_loss: 0.8612 - val_accuracy: 0.7247 - lr: 0.0100 - 31s/epoch - 20ms/step
Epoch 21/100
1563/1563 - 33s - loss: 0.7391 - accuracy: 0.7624 - val_loss: 0.7618 - val_accuracy: 0.7551 - lr: 0.0100 - 33s/epoch - 21ms/step
Epoch 22/100
1563/1563 - 32s - loss: 0.7312 - accuracy: 0.7623 - val_loss: 0.8004 - val_accuracy: 0.7414 - lr: 0.0100 - 32s/epoch - 20ms/step
Epoch 23/100
1563/1563 - 30s - loss: 0.7271 - accuracy: 0.7647 - val_loss: 0.8045 - val_accuracy: 0.7387 - lr: 0.0100 - 30s/epoch - 19ms/step
Epoch 24/100
1563/1563 - 25s - loss: 0.7171 - accuracy: 0.7684 - val_loss: 0.7818 - val_accuracy: 0.7483 - lr: 0.0100 - 25s/epoch - 16ms/step
Epoch 25/100
1563/1563 - 25s - loss: 0.7157 - accuracy: 0.7710 - val_loss: 0.7982 - val_accuracy: 0.7416 - lr: 0.0100 - 25s/epoch - 16ms/step
Epoch 26/100
Epoch 26: ReduceLROnPlateau reducing learning rate to 0.0009999999776482583.
1563/1563 - 26s - loss: 0.7067 - accuracy: 0.7728 - val_loss: 0.9440 - val_accuracy: 0.6980 - lr: 0.0100 - 26s/epoch - 17ms/step
Epoch 27/100
1563/1563 - 26s - loss: 0.6160 - accuracy: 0.8025 - val_loss: 0.6560 - val_accuracy: 0.7884 - lr: 1.0000e-03 - 26s/epoch - 17ms/step
Epoch 28/100
1563/1563 - 24s - loss: 0.5877 - accuracy: 0.8107 - val_loss: 0.6421 - val_accuracy: 0.7953 - lr: 1.0000e-03 - 24s/epoch - 16ms/step
Epoch 29/100
1563/1563 - 24s - loss: 0.5792 - accuracy: 0.8125 - val_loss: 0.6277 - val_accuracy: 0.7980 - lr: 1.0000e-03 - 24s/epoch - 15ms/step
Epoch 30/100
1563/1563 - 23s - loss: 0.5702 - accuracy: 0.8138 - val_loss: 0.6333 - val_accuracy: 0.7949 - lr: 1.0000e-03 - 23s/epoch - 15ms/step
Epoch 31/100
1563/1563 - 23s - loss: 0.5629 - accuracy: 0.8174 - val_loss: 0.6281 - val_accuracy: 0.7985 - lr: 1.0000e-03 - 23s/epoch - 15ms/step
Epoch 32/100
1563/1563 - 25s - loss: 0.5627 - accuracy: 0.8180 - val_loss: 0.6270 - val_accuracy: 0.7968 - lr: 1.0000e-03 - 25s/epoch - 16ms/step
Epoch 33/100
1563/1563 - 28s - loss: 0.5545 - accuracy: 0.8196 - val_loss: 0.6246 - val_accuracy: 0.7953 - lr: 1.0000e-03 - 28s/epoch - 18ms/step
Epoch 34/100
Epoch 34: ReduceLROnPlateau reducing learning rate to 9.999999310821295e-05.
1563/1563 - 27s - loss: 0.5518 - accuracy: 0.8192 - val_loss: 0.6243 - val_accuracy: 0.7950 - lr: 1.0000e-03 - 27s/epoch - 17ms/step
Epoch 35/100
1563/1563 - 25s - loss: 0.5412 - accuracy: 0.8231 - val_loss: 0.6172 - val_accuracy: 0.7987 - lr: 1.0000e-04 - 25s/epoch - 16ms/step
Epoch 36/100
1563/1563 - 27s - loss: 0.5342 - accuracy: 0.8254 - val_loss: 0.6172 - val_accuracy: 0.7991 - lr: 1.0000e-04 - 27s/epoch - 17ms/step
Epoch 37/100
1563/1563 - 26s - loss: 0.5378 - accuracy: 0.8246 - val_loss: 0.6176 - val_accuracy: 0.7978 - lr: 1.0000e-04 - 26s/epoch - 17ms/step
Epoch 38/100
1563/1563 - 27s - loss: 0.5365 - accuracy: 0.8239 - val_loss: 0.6164 - val_accuracy: 0.7983 - lr: 1.0000e-04 - 27s/epoch - 17ms/step
Epoch 39/100
1563/1563 - 32s - loss: 0.5349 - accuracy: 0.8240 - val_loss: 0.6165 - val_accuracy: 0.7995 - lr: 1.0000e-04 - 32s/epoch - 20ms/step
Epoch 40/100
1563/1563 - 30s - loss: 0.5306 - accuracy: 0.8261 - val_loss: 0.6169 - val_accuracy: 0.7992 - lr: 1.0000e-04 - 30s/epoch - 19ms/step
Epoch 41/100
Epoch 41: ReduceLROnPlateau reducing learning rate to 9.999999019782991e-06.
1563/1563 - 33s - loss: 0.5371 - accuracy: 0.8229 - val_loss: 0.6170 - val_accuracy: 0.7992 - lr: 1.0000e-04 - 33s/epoch - 21ms/step
Epoch 42/100
1563/1563 - 34s - loss: 0.5363 - accuracy: 0.8260 - val_loss: 0.6166 - val_accuracy: 0.7993 - lr: 1.0000e-05 - 34s/epoch - 21ms/step
Epoch 43/100
1563/1563 - 34s - loss: 0.5328 - accuracy: 0.8239 - val_loss: 0.6165 - val_accuracy: 0.7988 - lr: 1.0000e-05 - 34s/epoch - 22ms/step
Epoch 44/100
1563/1563 - 33s - loss: 0.5345 - accuracy: 0.8249 - val_loss: 0.6173 - val_accuracy: 0.7994 - lr: 1.0000e-05 - 33s/epoch - 21ms/step
Epoch 44: early stopping
2024-04-17 14:11:15.042814: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 1228800000 exceeds 10% of free system memory.
2024-04-17 14:11:15.828609: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 1228800000 exceeds 10% of free system memory.
2024-04-17 14:11:17.921039: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:378] Ignored output_format.
2024-04-17 14:11:17.921079: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:381] Ignored drop_control_dependency.
2024-04-17 14:11:17.921432: I tensorflow/cc/saved_model/reader.cc:83] Reading SavedModel from: cifar_classifier
2024-04-17 14:11:17.929337: I tensorflow/cc/saved_model/reader.cc:51] Reading meta graph with tags { serve }
2024-04-17 14:11:17.929378: I tensorflow/cc/saved_model/reader.cc:146] Reading SavedModel debug info (if present) from: cifar_classifier
2024-04-17 14:11:17.943241: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
2024-04-17 14:11:17.950320: I tensorflow/cc/saved_model/loader.cc:233] Restoring SavedModel bundle.
2024-04-17 14:11:18.203954: I tensorflow/cc/saved_model/loader.cc:217] Running initialization op on SavedModel bundle at path: cifar_classifier
2024-04-17 14:11:18.281984: I tensorflow/cc/saved_model/loader.cc:316] SavedModel load for tags { serve }; Status: success: OK. Took 360554 microseconds.
Summary on the non-converted ops:
---------------------------------
* Accepted dialects: tfl, builtin, func
* Non-Converted Ops: 29, Total Ops 49, % non-converted = 59.18 %
* 29 ARITH ops
- arith.constant: 29 occurrences (f32: 28, i32: 1)
(f32: 5)
(f32: 8)
(f32: 1)
(f32: 2)
(f32: 1)
2024-04-17 14:11:18.539922: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 1228800000 exceeds 10% of free system memory.
fully_quantize: 0, inference_type: 6, input_inference_type: INT8, output_inference_type: INT8
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
60.3515625
Accuracy: 0.7647