Switch to ICU tokenizer #939

Open
wants to merge 6 commits into
base: main

Use ICU system package

d585a63

firefoxci-taskcluster / finetune-student-ru-en succeeded Nov 23, 2024 in 2h 27m 22s

FirefoxCI (pull_request)

finetune student for ru-en

Details

Task Status

Started: 2024-11-23T00:46:33.011Z
Resolved: 2024-11-23T01:17:54.798Z
Task Execution Time: 31 minutes, 21 seconds, 787 milliseconds
Task Status: completed
Reason Resolved: completed
RunId: 0
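
The reported execution time can be cross-checked against the Started and Resolved timestamps. The following is a minimal sketch in Python, assuming the timestamps are UTC strings with millisecond precision in the form shown above; the helper name `parse_ts` is hypothetical and not part of Taskcluster.

```python
from datetime import datetime, timedelta, timezone

def parse_ts(ts: str) -> datetime:
    # Assumes a trailing "Z" (UTC) and a fractional-seconds field, as in the status block.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

started = parse_ts("2024-11-23T00:46:33.011Z")
resolved = parse_ts("2024-11-23T01:17:54.798Z")
elapsed = resolved - started

# Split the difference into whole minutes, whole seconds, and milliseconds.
minutes, remainder = divmod(elapsed, timedelta(minutes=1))
seconds, remainder = divmod(remainder, timedelta(seconds=1))
milliseconds = remainder // timedelta(milliseconds=1)
print(f"{minutes} minutes, {seconds} seconds, {milliseconds} milliseconds")
```

The difference between the two timestamps matches the Task Execution Time line. It is slightly longer than the wall time the worker reports in its own log, which covers only the command itself.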

Artifacts

- public/build/config.opustrainer.yml
- public/build/config.opustrainer.yml.state
- public/build/devset.out
- public/build/final.model.npz.best-chrf.npz
- public/build/final.model.npz.best-chrf.npz.decoder.yml
- public/build/model.npz
- public/build/model.npz.best-bleu-detok.npz
- public/build/model.npz.best-bleu-detok.npz.decoder.yml
- public/build/model.npz.best-ce-mean-words.npz
- public/build/model.npz.best-ce-mean-words.npz.decoder.yml
- public/build/model.npz.best-chrf.npz
- public/build/model.npz.best-chrf.npz.decoder.yml
- public/build/model.npz.decoder.yml
- public/build/model.npz.optimizer.npz
- public/build/model.npz.progress.yml
- public/build/model.npz.yml
- public/build/opustrainer.log
- public/build/train.log
- public/build/valid.log
- public/build/vocab.spm
- public/logs/live_backing.log
- public/logs/live.log


[taskcluster 2024-11-23T00:46:32.806Z] Worker Type (translations-1/b-linux-v100-gpu-4-2tb) settings:
[taskcluster 2024-11-23T00:46:32.806Z]   {
[taskcluster 2024-11-23T00:46:32.806Z]     "config": {
[taskcluster 2024-11-23T00:46:32.806Z]       "deploymentId": ""
[taskcluster 2024-11-23T00:46:32.806Z]     },
[taskcluster 2024-11-23T00:46:32.806Z]     "generic-worker": {
[taskcluster 2024-11-23T00:46:32.806Z]       "engine": "insecure",
[taskcluster 2024-11-23T00:46:32.806Z]       "go-arch": "amd64",
[taskcluster 2024-11-23T00:46:32.806Z]       "go-os": "linux",
[taskcluster 2024-11-23T00:46:32.806Z]       "go-version": "go1.22.2",
[taskcluster 2024-11-23T00:46:32.806Z]       "release": "https://github.com/taskcluster/taskcluster/releases/tag/v64.2.6",
[taskcluster 2024-11-23T00:46:32.806Z]       "revision": "edab196d7d030a5d625b77335109cd9060ab7e1f",
[taskcluster 2024-11-23T00:46:32.806Z]       "source": "https://github.com/taskcluster/taskcluster/commits/edab196d7d030a5d625b77335109cd9060ab7e1f",
[taskcluster 2024-11-23T00:46:32.806Z]       "version": "64.2.6"
[taskcluster 2024-11-23T00:46:32.806Z]     },
[taskcluster 2024-11-23T00:46:32.806Z]     "image": "projects/taskcluster-imaging/global/images/gw-translations-gcp-googlecompute-2024-04-22t18-22-42z",
[taskcluster 2024-11-23T00:46:32.806Z]     "instance-id": "1602537126860987302",
[taskcluster 2024-11-23T00:46:32.806Z]     "instance-type": "projects/887720501152/machineTypes/custom-40-262144",
[taskcluster 2024-11-23T00:46:32.806Z]     "local-ipv4": "10.128.0.42",
[taskcluster 2024-11-23T00:46:32.806Z]     "project-id": "fxci-production-level1-workers",

...(5804 lines hidden)...

[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1191
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1192
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1193
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1194
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1195
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1196
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1197
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Skipping line because of exception: ValueError('Out-of-bound alignment pairs')
[task 2024-11-23T01:17:49.074Z] 
[task 2024-11-23T01:17:49.074Z] [tag] Skipping line because of exception: ValueError('Out-of-bound alignment pairs')
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1198
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1199
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1200
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1201
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1202
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1203
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1204
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1205
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1206
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1207
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1208
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1209
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1210
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1211
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1212
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1213
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1214
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1215
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1216
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1217
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Skipping line because of exception: ValueError('Out-of-bound alignment pairs')
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Skipping line because of exception: ValueError('Out-of-bound alignment pairs')
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1218
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1219
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1220
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1221
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1222
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1223
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1224
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1225
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1226
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1227
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1228
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1229
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1230
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1231
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1232
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1233
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1234
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1235
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1236
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1237
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1238
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1239
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1240
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1241
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1242
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1243
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1244
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1245
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1246
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1247
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1248
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1249
[task 2024-11-23T01:17:49.075Z] 
[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1250
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1251
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1252
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1253
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1254
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1255
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1256
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1257
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1258
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1259
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1260
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1261
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1262
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1263
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1264
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1265
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1266
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1267
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1268
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] Reading original for epoch 1269
[task 2024-11-23T01:17:49.076Z] 
[task 2024-11-23T01:17:49.076Z] [tag] trainer stopped reading input
[task 2024-11-23T01:17:49.076Z] 
[fetches 2024-11-23T01:17:51.258Z] removing /home/ubuntu/tasks/task_173232279266462/fetches
[fetches 2024-11-23T01:17:51.626Z] finished
[taskcluster 2024-11-23T01:17:51.637Z]    Exit Code: 0
[taskcluster 2024-11-23T01:17:51.637Z]    User Time: 1h52m52.330821s
[taskcluster 2024-11-23T01:17:51.637Z]  Kernel Time: 2m26.401632s
[taskcluster 2024-11-23T01:17:51.637Z]    Wall Time: 31m17.905560262s
[taskcluster 2024-11-23T01:17:51.637Z]       Result: SUCCEEDED
[taskcluster 2024-11-23T01:17:51.638Z] === Task Finished ===
[taskcluster 2024-11-23T01:17:51.638Z] Task Duration: 31m17.907644773s
[taskcluster 2024-11-23T01:17:51.758Z] Uploading artifact public/build/model.npz.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.765Z] Uploading artifact public/build/model.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.765Z] Uploading artifact public/build/model.npz.best-ce-mean-words.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-ce-mean-words.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.770Z] Uploading artifact public/build/model.npz.decoder.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.decoder.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.770Z] Uploading artifact public/build/devset.out from file /home/ubuntu/tasks/task_173232279266462/artifacts/devset.out with content encoding "gzip", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.770Z] Uploading artifact public/build/model.npz.best-chrf.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-chrf.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.770Z] Uploading artifact public/build/model.npz.best-bleu-detok.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-bleu-detok.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.771Z] Uploading artifact public/build/opustrainer.log from file /home/ubuntu/tasks/task_173232279266462/artifacts/opustrainer.log with content encoding "gzip", mime type "text/plain" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.772Z] Uploading artifact public/build/model.npz.progress.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.progress.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.775Z] Uploading artifact public/build/valid.log from file /home/ubuntu/tasks/task_173232279266462/artifacts/valid.log with content encoding "gzip", mime type "text/plain" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.775Z] Uploading artifact public/build/train.log from file /home/ubuntu/tasks/task_173232279266462/artifacts/train.log with content encoding "gzip", mime type "text/plain" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.778Z] Uploading artifact public/build/model.npz.best-bleu-detok.npz.decoder.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-bleu-detok.npz.decoder.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.779Z] Uploading artifact public/build/model.npz.best-chrf.npz.decoder.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-chrf.npz.decoder.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.786Z] Uploading artifact public/build/config.opustrainer.yml.state from file /home/ubuntu/tasks/task_173232279266462/artifacts/config.opustrainer.yml.state with content encoding "gzip", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.786Z] Uploading artifact public/build/final.model.npz.best-chrf.npz.decoder.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/final.model.npz.best-chrf.npz.decoder.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.786Z] Uploading artifact public/build/final.model.npz.best-chrf.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/final.model.npz.best-chrf.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.786Z] Uploading artifact public/build/config.opustrainer.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/config.opustrainer.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.786Z] Uploading artifact public/build/model.npz.best-ce-mean-words.npz.decoder.yml from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.best-ce-mean-words.npz.decoder.yml with content encoding "gzip", mime type "application/x-yaml" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.794Z] Uploading artifact public/build/model.npz.optimizer.npz from file /home/ubuntu/tasks/task_173232279266462/artifacts/model.npz.optimizer.npz with content encoding "identity", mime type "application/octet-stream" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:51.794Z] Uploading artifact public/build/vocab.spm from file /home/ubuntu/tasks/task_173232279266462/artifacts/vocab.spm with content encoding "gzip", mime type "application/x-source-rpm" and expiry 2025-11-17T22:50:31.650Z
[taskcluster 2024-11-23T01:17:54.483Z] [mounts] Preserving cache: Moving "/home/ubuntu/tasks/task_173232279266462/checkouts" to "/home/ubuntu/caches/ArbA23bWSdWuN-TMpfP8jA"
[taskcluster 2024-11-23T01:17:54.566Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2025-11-17T22:50:31.650Z
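
The `[tag]` lines in the log above are repetitive enough that a short script is an easier way to see how far the data reader got and how many lines were skipped. This is a rough sketch in Python, assuming each log line has the `[source timestamp] message` shape seen in this excerpt; the function name `summarize` and the regular expressions are illustrative and are not part of Taskcluster or OpusTrainer.

```python
import re
from collections import Counter

# Assumed line shape: "[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1191"
LINE_RE = re.compile(r"^\[(?P<source>\w+) (?P<ts>[^\]]+)\]\s?(?P<msg>.*)$")
EPOCH_RE = re.compile(r"Reading original for epoch (\d+)")
SKIP_RE = re.compile(r"Skipping line because of exception: (.+)$")

def summarize(lines):
    """Return (last epoch number seen, Counter of skip reasons)."""
    last_epoch = None
    skips = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # not a prefixed log line, e.g. the "(lines hidden)" marker
        msg = match.group("msg")
        if (epoch := EPOCH_RE.search(msg)):
            last_epoch = int(epoch.group(1))
        elif (skip := SKIP_RE.search(msg)):
            skips[skip.group(1)] += 1
    return last_epoch, skips

# A few lines copied from the excerpt above, used as sample input.
sample = [
    "[task 2024-11-23T01:17:49.074Z] [tag] Reading original for epoch 1197",
    "[task 2024-11-23T01:17:49.074Z] ",
    "[task 2024-11-23T01:17:49.074Z] [tag] Skipping line because of exception: ValueError('Out-of-bound alignment pairs')",
    "[task 2024-11-23T01:17:49.075Z] [tag] Reading original for epoch 1198",
]
last_epoch, skips = summarize(sample)
print(last_epoch, dict(skips))
```

To run it on a real log, pass an open file handle for a downloaded copy of `public/logs/live_backing.log` to `summarize` in place of `sample`.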