Switch to ICU tokenizer #939

Merged 15 commits on Dec 21, 2024

Use ICU system package

d585a63
firefoxci-taskcluster / merge-translated-ru-en succeeded Nov 23, 2024 in 1h 14m 10s

FirefoxCI (pull_request)

merge translated for ru-en

Task Status

Started: 2024-11-23T00:03:57.840Z
Resolved: 2024-11-23T00:04:42.263Z
Task Execution Time: 44 seconds, 423 milliseconds
Task Status: completed
Reason Resolved: completed
RunId: 0

Artifacts

- public/build/corpus.en.zst
- public/build/corpus.ru.zst
- public/logs/live_backing.log
- public/logs/live.log


[taskcluster 2024-11-23T00:03:57.760Z] Worker Type (translations-1/b-linux-large-gcp-1tb-32-256-d2g) settings:
[taskcluster 2024-11-23T00:03:57.760Z]   {
[taskcluster 2024-11-23T00:03:57.760Z]     "config": {
[taskcluster 2024-11-23T00:03:57.760Z]       "deploymentId": ""
[taskcluster 2024-11-23T00:03:57.760Z]     },
[taskcluster 2024-11-23T00:03:57.760Z]     "generic-worker": {
[taskcluster 2024-11-23T00:03:57.760Z]       "config": {
[taskcluster 2024-11-23T00:03:57.760Z]         "headlessTasks": true,
[taskcluster 2024-11-23T00:03:57.760Z]         "runTasksAsCurrentUser": false
[taskcluster 2024-11-23T00:03:57.760Z]       },
[taskcluster 2024-11-23T00:03:57.760Z]       "engine": "multiuser",
[taskcluster 2024-11-23T00:03:57.760Z]       "go-arch": "amd64",
[taskcluster 2024-11-23T00:03:57.760Z]       "go-os": "linux",
[taskcluster 2024-11-23T00:03:57.760Z]       "go-version": "go1.23.3",
[taskcluster 2024-11-23T00:03:57.760Z]       "release": "https://github.com/taskcluster/taskcluster/releases/tag/v75.0.0",
[taskcluster 2024-11-23T00:03:57.760Z]       "revision": "1f02e08d3ac9520fd636409dd44f97b9945bf882",
[taskcluster 2024-11-23T00:03:57.760Z]       "source": "https://github.com/taskcluster/taskcluster/commits/1f02e08d3ac9520fd636409dd44f97b9945bf882",
[taskcluster 2024-11-23T00:03:57.760Z]       "version": "75.0.0"
[taskcluster 2024-11-23T00:03:57.760Z]     },
[taskcluster 2024-11-23T00:03:57.760Z]     "image": "projects/taskcluster-imaging/global/images/gw-fxci-gcp-l1-2404-amd64-headless-googlecompute-alpha",

...(110 lines hidden)...

[taskcluster:warn 2024-11-23T00:03:57.765Z]   requires: all-completed
[taskcluster:warn 2024-11-23T00:03:57.765Z]   retries: 5
[taskcluster:warn 2024-11-23T00:03:57.765Z]   routes:
[taskcluster:warn 2024-11-23T00:03:57.765Z]   - index.translations.cache.pr.merge-translated.merge-translated-ru-en.hash.da0bca75d1ad9bef6623d6b64501544be31819ca08864617f7b01f59463ebe8b
[taskcluster:warn 2024-11-23T00:03:57.765Z]   - checks
[taskcluster:warn 2024-11-23T00:03:57.765Z]   schedulerId: translations-level-1
[taskcluster:warn 2024-11-23T00:03:57.765Z]   scopes:
[taskcluster:warn 2024-11-23T00:03:57.765Z]   - docker-worker:cache:translations-level-1-checkouts-v3-7afeb851dd97df8f3607-KnyIE1GvSz67R9mjL97Now
[taskcluster:warn 2024-11-23T00:03:57.765Z]   - generic-worker:cache:translations-level-1-checkouts-v3-7afeb851dd97df8f3607-KnyIE1GvSz67R9mjL97Now
[taskcluster:warn 2024-11-23T00:03:57.765Z]   - generic-worker:os-group:translations-1/b-linux-large-gcp-1tb-32-256-d2g/docker
[taskcluster:warn 2024-11-23T00:03:57.765Z]   tags:
[taskcluster:warn 2024-11-23T00:03:57.765Z]     createdForUser: [email protected]
[taskcluster:warn 2024-11-23T00:03:57.765Z]     kind: merge-translated
[taskcluster:warn 2024-11-23T00:03:57.765Z]     label: merge-translated-ru-en
[taskcluster:warn 2024-11-23T00:03:57.765Z]     os: linux
[taskcluster:warn 2024-11-23T00:03:57.765Z]     worker-implementation: docker-worker
[taskcluster:warn 2024-11-23T00:03:57.765Z]   taskGroupId: KygK5rmiQQSbJCq_hTsFbQ
[taskcluster:warn 2024-11-23T00:03:57.765Z]   taskQueueId: translations-1/b-linux-large-gcp-1tb-32-256-d2g
[taskcluster:warn 2024-11-23T00:03:57.765Z]   workerType: b-linux-large-gcp-1tb-32-256-d2g
[taskcluster:warn 2024-11-23T00:03:57.765Z] 
[taskcluster 2024-11-23T00:03:58.186Z] Uploading redirect artifact public/logs/live.log to URL https://firefoxci-websocktunnel.services.mozilla.com/us-central1-a.2876352915649111342.60099/log/VS7rNiVCQF6abIi8pp_YMQ with mime type "text/plain; charset=utf-8" and expiry 2024-11-24T00:33:58.117Z
[taskcluster 2024-11-23T00:03:58.603Z] [mounts] No existing writable directory cache 'translations-level-1-checkouts-v3-7afeb851dd97df8f3607-KnyIE1GvSz67R9mjL97Now' - creating caches/Syfv_SBwS_KSDeePaFMAww
[taskcluster 2024-11-23T00:03:58.603Z] [mounts] Creating directory /home/task_173232023300468/cache0
[taskcluster 2024-11-23T00:03:58.613Z] [mounts] Updating ownership of files inside directory '/home/task_173232023300468/cache0' from root to task_173232023300468
[taskcluster 2024-11-23T00:03:58.638Z] [mounts] Successfully mounted writable directory cache '/home/task_173232023300468/cache0'
[taskcluster 2024-11-23T00:03:58.638Z] [mounts] Downloading task KnyIE1GvSz67R9mjL97Now artifact public/image.tar.zst to downloads/EcIU3d6MQoiGILx34RHIRw
[taskcluster 2024-11-23T00:04:06.213Z] [mounts] Downloaded 301157895 bytes with SHA256 19e55d20166d6589bcb4948a5be88fce5ebbecce6e4a79da11150c8e1004e24c from task KnyIE1GvSz67R9mjL97Now artifact public/image.tar.zst to downloads/EcIU3d6MQoiGILx34RHIRw
[taskcluster:warn 2024-11-23T00:04:06.214Z] [mounts] Download downloads/EcIU3d6MQoiGILx34RHIRw of task KnyIE1GvSz67R9mjL97Now artifact public/image.tar.zst has SHA256 19e55d20166d6589bcb4948a5be88fce5ebbecce6e4a79da11150c8e1004e24c but task payload does not declare a required value, so content authenticity cannot be verified
[taskcluster 2024-11-23T00:04:06.214Z] [mounts] Creating directory /home/task_173232023300468
[taskcluster 2024-11-23T00:04:06.233Z] [mounts] Copying downloads/EcIU3d6MQoiGILx34RHIRw to /home/task_173232023300468/dockerimage
[taskcluster 2024-11-23T00:04:08.638Z] [mounts] Granting task_173232023300468 full control of file '/home/task_173232023300468/dockerimage'
[taskcluster 2024-11-23T00:04:08.639Z] Executing command 0: bash -cx 'IMAGE_ID=$(docker load --input dockerimage | sed -n '\''0,/^Loaded image: /s/^Loaded image: //p'\'')
[taskcluster 2024-11-23T00:04:08.639Z] timeout -s KILL 86400 docker run -t --name taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ --memory-swap -1 --pids-limit -1 -v "$(pwd)/cache0:/builds/worker/checkouts" --add-host=taskcluster:127.0.0.1 --net=host -e FIREFOX_TRANSLATIONS_TRAINING_BASE_REPOSITORY -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REF -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REPOSITORY -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REV -e FIREFOX_TRANSLATIONS_TRAINING_REPOSITORY_TYPE -e HG_STORE_PATH -e MOZ_AUTOMATION -e MOZ_FETCHES -e MOZ_FETCHES_DIR -e MOZ_SCM_LEVEL -e REPOSITORIES -e RUN_ID -e SCCACHE_DISABLE -e SRC -e TASKCLUSTER_CACHES -e TASKCLUSTER_INSTANCE_TYPE -e TASKCLUSTER_PROXY_URL -e TASKCLUSTER_ROOT_URL -e TASKCLUSTER_UNTRUSTED_CACHES -e TASKCLUSTER_VOLUMES -e TASKCLUSTER_WORKER_LOCATION -e TASK_GROUP_ID -e TASK_ID -e TRG -e VCS_PATH "${IMAGE_ID}" /usr/local/bin/run-task --firefox_translations_training-checkout=/builds/worker/checkouts/vcs/ -- bash -c '\''export BIN=$MOZ_FETCHES_DIR && $VCS_PATH/pipeline/translate/merge-corpus.sh $MOZ_FETCHES_DIR/corpus.ru.zst $MOZ_FETCHES_DIR/mono.ru.zst $MOZ_FETCHES_DIR/corpus.en.zst $MOZ_FETCHES_DIR/mono.en.zst $TASK_WORKDIR/artifacts/corpus.ru.zst $TASK_WORKDIR/artifacts/corpus.en.zst'\''
[taskcluster 2024-11-23T00:04:08.639Z] exit_code=$?
[taskcluster 2024-11-23T00:04:08.639Z] docker cp taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ:/builds/worker/artifacts artifact0
[taskcluster 2024-11-23T00:04:08.639Z] docker rm taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ
[taskcluster 2024-11-23T00:04:08.639Z] exit "${exit_code}"'
++ docker load --input dockerimage
++ sed -n '0,/^Loaded image: /s/^Loaded image: //p'
+ IMAGE_ID=train:latest
++ pwd
+ timeout -s KILL 86400 docker run -t --name taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ --memory-swap -1 --pids-limit -1 -v /home/task_173232023300468/cache0:/builds/worker/checkouts --add-host=taskcluster:127.0.0.1 --net=host -e FIREFOX_TRANSLATIONS_TRAINING_BASE_REPOSITORY -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REF -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REPOSITORY -e FIREFOX_TRANSLATIONS_TRAINING_HEAD_REV -e FIREFOX_TRANSLATIONS_TRAINING_REPOSITORY_TYPE -e HG_STORE_PATH -e MOZ_AUTOMATION -e MOZ_FETCHES -e MOZ_FETCHES_DIR -e MOZ_SCM_LEVEL -e REPOSITORIES -e RUN_ID -e SCCACHE_DISABLE -e SRC -e TASKCLUSTER_CACHES -e TASKCLUSTER_INSTANCE_TYPE -e TASKCLUSTER_PROXY_URL -e TASKCLUSTER_ROOT_URL -e TASKCLUSTER_UNTRUSTED_CACHES -e TASKCLUSTER_VOLUMES -e TASKCLUSTER_WORKER_LOCATION -e TASK_GROUP_ID -e TASK_ID -e TRG -e VCS_PATH train:latest /usr/local/bin/run-task --firefox_translations_training-checkout=/builds/worker/checkouts/vcs/ -- bash -c 'export BIN=$MOZ_FETCHES_DIR && $VCS_PATH/pipeline/translate/merge-corpus.sh $MOZ_FETCHES_DIR/corpus.ru.zst $MOZ_FETCHES_DIR/mono.ru.zst $MOZ_FETCHES_DIR/corpus.en.zst $MOZ_FETCHES_DIR/mono.en.zst $TASK_WORKDIR/artifacts/corpus.ru.zst $TASK_WORKDIR/artifacts/corpus.en.zst'
[setup 2024-11-23T00:04:29.711Z] run-task started in /builds/worker
[setup 2024-11-23T00:04:29.711Z] Invoked by command: --firefox_translations_training-checkout=/builds/worker/checkouts/vcs/ -- bash -c export BIN=$MOZ_FETCHES_DIR && $VCS_PATH/pipeline/translate/merge-corpus.sh $MOZ_FETCHES_DIR/corpus.ru.zst $MOZ_FETCHES_DIR/mono.ru.zst $MOZ_FETCHES_DIR/corpus.en.zst $MOZ_FETCHES_DIR/mono.en.zst $TASK_WORKDIR/artifacts/corpus.ru.zst $TASK_WORKDIR/artifacts/corpus.en.zst
[setup 2024-11-23T00:04:29.711Z] Python version: 3.10.12
[cache 2024-11-23T00:04:29.712Z] cache /builds/worker/checkouts is empty; writing requirements: gid=1000 uid=1000 version=1
[volume 2024-11-23T00:04:29.712Z] volume /builds/worker/checkouts is a cache
[setup 2024-11-23T00:04:29.712Z] running as worker:worker
[vcs 2024-11-23T00:04:29.713Z] executing ['git', 'config', '--global', '--add', 'safe.directory', '/builds/worker/checkouts/vcs']
[vcs 2024-11-23T00:04:29.714Z] executing ['git', 'clone', 'https://github.com/mozilla/translations', '/builds/worker/checkouts/vcs']
[vcs 2024-11-23T00:04:29.716Z] Cloning into '/builds/worker/checkouts/vcs'...
[vcs 2024-11-23T00:04:31.794Z] executing ['git', 'fetch', '--tags', '--force', 'https://github.com/mozilla/translations', 'icu_tokenizer']
[vcs 2024-11-23T00:04:31.970Z] From https://github.com/mozilla/translations
[vcs 2024-11-23T00:04:31.970Z]  * branch            icu_tokenizer -> FETCH_HEAD
[vcs 2024-11-23T00:04:31.977Z] executing ['git', 'fetch', '--no-tags', 'https://github.com/mozilla/translations', 'icu_tokenizer']
[vcs 2024-11-23T00:04:32.155Z] From https://github.com/mozilla/translations
[vcs 2024-11-23T00:04:32.156Z]  * branch            icu_tokenizer -> FETCH_HEAD
[vcs 2024-11-23T00:04:32.163Z] executing ['git', 'checkout', '-f', '-B', 'icu_tokenizer', 'd585a63a6abc04ece83e26ce51a0caa2f7fa21e6']
[vcs 2024-11-23T00:04:32.820Z] Switched to a new branch 'icu_tokenizer'
[vcs 2024-11-23T00:04:32.844Z] executing ['git', 'submodule', 'init']
[vcs 2024-11-23T00:04:32.868Z] Submodule '3rd_party/browsermt-marian-dev' (https://github.com/browsermt/marian-dev) registered for path '3rd_party/browsermt-marian-dev'
[vcs 2024-11-23T00:04:32.868Z] Submodule 'extract-lex' (https://github.com/marian-nmt/extract-lex) registered for path '3rd_party/extract-lex'
[vcs 2024-11-23T00:04:32.869Z] Submodule 'fast_align' (https://github.com/clab/fast_align) registered for path '3rd_party/fast_align'
[vcs 2024-11-23T00:04:32.869Z] Submodule '3rd_party/kenlm' (https://github.com/kpu/kenlm) registered for path '3rd_party/kenlm'
[vcs 2024-11-23T00:04:32.870Z] Submodule '3rd_party/marian-dev' (https://github.com/marian-nmt/marian-dev) registered for path '3rd_party/marian-dev'
[vcs 2024-11-23T00:04:32.871Z] Submodule '3rd_party/preprocess' (https://github.com/kpu/preprocess.git) registered for path '3rd_party/preprocess'
[vcs 2024-11-23T00:04:32.871Z] Submodule 'inference/3rd_party/browsermt-marian-dev' (https://github.com/browsermt/marian-dev) registered for path 'inference/3rd_party/browsermt-marian-dev'
[vcs 2024-11-23T00:04:32.872Z] Submodule 'inference/3rd_party/emsdk' (https://github.com/emscripten-core/emsdk.git) registered for path 'inference/3rd_party/emsdk'
[vcs 2024-11-23T00:04:32.873Z] Submodule 'inference/3rd_party/ssplit-cpp' (https://github.com/browsermt/ssplit-cpp) registered for path 'inference/3rd_party/ssplit-cpp'
[vcs 2024-11-23T00:04:32.873Z] executing ['git', 'submodule', 'update', '--force']
[vcs 2024-11-23T00:04:32.899Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/browsermt-marian-dev'...
[vcs 2024-11-23T00:04:34.118Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/extract-lex'...
[vcs 2024-11-23T00:04:34.449Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/fast_align'...
[vcs 2024-11-23T00:04:34.790Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/kenlm'...
[vcs 2024-11-23T00:04:35.443Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/marian-dev'...
[vcs 2024-11-23T00:04:36.912Z] Cloning into '/builds/worker/checkouts/vcs/3rd_party/preprocess'...
[vcs 2024-11-23T00:04:37.366Z] Cloning into '/builds/worker/checkouts/vcs/inference/3rd_party/browsermt-marian-dev'...
[vcs 2024-11-23T00:04:38.460Z] Cloning into '/builds/worker/checkouts/vcs/inference/3rd_party/emsdk'...
[vcs 2024-11-23T00:04:38.989Z] Cloning into '/builds/worker/checkouts/vcs/inference/3rd_party/ssplit-cpp'...
[vcs 2024-11-23T00:04:39.417Z] Submodule path '3rd_party/browsermt-marian-dev': checked out '11c6ae7c46be21ef96ed10c60f28022fa968939f'
[vcs 2024-11-23T00:04:39.433Z] Submodule path '3rd_party/extract-lex': checked out '42fa605b53f32eaf6c6e0b5677255c21c91b3d49'
[vcs 2024-11-23T00:04:39.449Z] Submodule path '3rd_party/fast_align': checked out 'cab1e9aac8d3bb02ff5ae58218d8d225a039fa11'
[vcs 2024-11-23T00:04:39.482Z] Submodule path '3rd_party/kenlm': checked out 'bbf4fc511266c5d4515047055d7bdec659a6e158'
[vcs 2024-11-23T00:04:39.606Z] Submodule path '3rd_party/marian-dev': checked out 'e8a1a2530fb84cbff7383302ebca393e5875c441'
[vcs 2024-11-23T00:04:39.633Z] Submodule path '3rd_party/preprocess': checked out '64307314b4d5a9a0bd529b5c1036b0710d995eec'
[vcs 2024-11-23T00:04:39.716Z] Submodule path 'inference/3rd_party/browsermt-marian-dev': checked out '2781d735d4a10dca876d61be587afdab2726293c'
[vcs 2024-11-23T00:04:39.741Z] Submodule path 'inference/3rd_party/emsdk': checked out '2346baa7bb44a4a0571cc75f1986ab9aaa35aa03'
[vcs 2024-11-23T00:04:39.760Z] Submodule path 'inference/3rd_party/ssplit-cpp': checked out 'a311f9865ade34db1e8e080e6cc146f55dafb067'
[vcs 2024-11-23T00:04:39.760Z] cleaning git checkout...
[vcs 2024-11-23T00:04:39.760Z] executing ['git', 'clean', '-nxdff']
[vcs 2024-11-23T00:04:39.764Z] removing []
[vcs 2024-11-23T00:04:39.764Z] successfully cleaned git checkout!
[vcs 2024-11-23T00:04:39.766Z] TinderboxPrint:<a href='https://github.com/mozilla/translations/commit/d585a63a6abc04ece83e26ce51a0caa2f7fa21e6' title='Built from translations commit d585a63a6abc04ece83e26ce51a0caa2f7fa21e6'>d585a63a6abc04ece83e26ce51a0caa2f7fa21e6</a>
[setup 2024-11-23T00:04:39.766Z] MOZ_FETCHES_DIR is /builds/worker/fetches
[fetches 2024-11-23T00:04:39.766Z] fetching artifacts
[fetches 2024-11-23T00:04:39.766Z] executing ['/usr/bin/python3', '-u', '/usr/local/bin/fetch-content', 'task-artifacts']
attempt 1/5
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Xt1emcAkQJOYWTQ4C5sr3Q/artifacts/public/build/mono.en.zst to /builds/worker/fetches/mono.en.zst
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Xt1emcAkQJOYWTQ4C5sr3Q/artifacts/public/build/mono.en.zst
attempt 1/5
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dH4MpUKGRhu5pycJq0qwcg/artifacts/public/build/corpus.en.zst to /builds/worker/fetches/corpus.en.zst
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dH4MpUKGRhu5pycJq0qwcg/artifacts/public/build/corpus.en.zst
attempt 1/5
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CMk3dMWqQrOhiX5_lJExuQ/artifacts/public/build/corpus.ru.zst to /builds/worker/fetches/corpus.ru.zst
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CMk3dMWqQrOhiX5_lJExuQ/artifacts/public/build/corpus.ru.zst
attempt 1/5
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RcSp41SmREiuG3yT1h6UHg/artifacts/public/build/dedupe.tar.zst to /builds/worker/fetches/dedupe.tar.zst
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RcSp41SmREiuG3yT1h6UHg/artifacts/public/build/dedupe.tar.zst
attempt 1/5
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QRGwRMMlR2Chi_A4GizTMQ/artifacts/public/build/mono.ru.zst to /builds/worker/fetches/mono.ru.zst
Downloading https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QRGwRMMlR2Chi_A4GizTMQ/artifacts/public/build/mono.ru.zst
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CMk3dMWqQrOhiX5_lJExuQ/artifacts/public/build/corpus.ru.zst resolved to 49046 bytes with sha256 80474217965372eb308a256bc515ff120bf326ce945c69497647be9b77d94b6d in 0.115s
Verified size of https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CMk3dMWqQrOhiX5_lJExuQ/artifacts/public/build/corpus.ru.zst
Extracting /builds/worker/fetches/corpus.ru.zst to /builds/worker/fetches
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RcSp41SmREiuG3yT1h6UHg/artifacts/public/build/dedupe.tar.zst resolved to 133246 bytes with sha256 6b021bdc0013dbd8e676afda47ef0a8eab66813947095922d605028bff5eb4a4 in 0.116s
Verified size of https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RcSp41SmREiuG3yT1h6UHg/artifacts/public/build/dedupe.tar.zst
Extracting /builds/worker/fetches/dedupe.tar.zst to /builds/worker/fetches
/builds/worker/fetches/dedupe.tar.zst extracted in 0.003s
Removing /builds/worker/fetches/dedupe.tar.zst
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Xt1emcAkQJOYWTQ4C5sr3Q/artifacts/public/build/mono.en.zst resolved to 11190 bytes with sha256 94c73e9f49ba2efcd2570d0fadf4f9af8675c957821b5b3e336cc18d7ebf170e in 0.194s
Verified size of https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Xt1emcAkQJOYWTQ4C5sr3Q/artifacts/public/build/mono.en.zst
Extracting /builds/worker/fetches/mono.en.zst to /builds/worker/fetches
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dH4MpUKGRhu5pycJq0qwcg/artifacts/public/build/corpus.en.zst resolved to 16316 bytes with sha256 2406049ef20a11eca86d469bbcd2cbf11476875c43a9f980d99feb31167d63d2 in 0.199s
Verified size of https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dH4MpUKGRhu5pycJq0qwcg/artifacts/public/build/corpus.en.zst
Extracting /builds/worker/fetches/corpus.en.zst to /builds/worker/fetches
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QRGwRMMlR2Chi_A4GizTMQ/artifacts/public/build/mono.ru.zst resolved to 37563 bytes with sha256 304e08e5d74cb9ea8c5df020122b5ed56472d0b41a2848aecbd08b91d788b528 in 0.222s
Verified size of https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QRGwRMMlR2Chi_A4GizTMQ/artifacts/public/build/mono.ru.zst
Extracting /builds/worker/fetches/mono.ru.zst to /builds/worker/fetches
PERFHERDER_DATA: {"framework": {"name": "build_metrics"}, "suites": [{"name": "fetch_content", "value": 0.22702541400001053, "lowerIsBetter": true, "shouldAlert": false, "subtests": []}]}
[fetches 2024-11-23T00:04:40.076Z] finished fetching artifacts
[task 2024-11-23T00:04:40.076Z] executing ['bash', '-c', 'export BIN=$MOZ_FETCHES_DIR && $VCS_PATH/pipeline/translate/merge-corpus.sh $MOZ_FETCHES_DIR/corpus.ru.zst $MOZ_FETCHES_DIR/mono.ru.zst $MOZ_FETCHES_DIR/corpus.en.zst $MOZ_FETCHES_DIR/mono.en.zst $TASK_WORKDIR/artifacts/corpus.ru.zst $TASK_WORKDIR/artifacts/corpus.en.zst']
[task 2024-11-23T00:04:40.079Z] + set -euo pipefail
[task 2024-11-23T00:04:40.079Z] + test -v BIN
[task 2024-11-23T00:04:40.079Z] + echo '###### Merging datasets'
[task 2024-11-23T00:04:40.079Z] ###### Merging datasets
[task 2024-11-23T00:04:40.079Z] + src1=/builds/worker/fetches/corpus.ru.zst
[task 2024-11-23T00:04:40.079Z] + src2=/builds/worker/fetches/mono.ru.zst
[task 2024-11-23T00:04:40.079Z] + trg1=/builds/worker/fetches/corpus.en.zst
[task 2024-11-23T00:04:40.079Z] + trg2=/builds/worker/fetches/mono.en.zst
[task 2024-11-23T00:04:40.079Z] + res_src=/builds/worker/artifacts/corpus.ru.zst
[task 2024-11-23T00:04:40.079Z] + res_trg=/builds/worker/artifacts/corpus.en.zst
[task 2024-11-23T00:04:40.079Z] ++ dirname /builds/worker/artifacts/corpus.ru.zst
[task 2024-11-23T00:04:40.080Z] + tmp_dir=/builds/worker/artifacts/tmp
[task 2024-11-23T00:04:40.080Z] + mkdir -p /builds/worker/artifacts/tmp
[task 2024-11-23T00:04:40.082Z] + zstdmt
[task 2024-11-23T00:04:40.082Z] ++ zstdmt -dc /builds/worker/fetches/corpus.ru.zst
[task 2024-11-23T00:04:40.082Z] + cat /dev/fd/63 /dev/fd/62
[task 2024-11-23T00:04:40.082Z] ++ zstdmt -dc /builds/worker/fetches/mono.ru.zst
[task 2024-11-23T00:04:40.089Z] + zstdmt
[task 2024-11-23T00:04:40.089Z] ++ zstdmt -dc /builds/worker/fetches/corpus.en.zst
[task 2024-11-23T00:04:40.089Z] + cat /dev/fd/63 /dev/fd/62
[task 2024-11-23T00:04:40.089Z] ++ zstdmt -dc /builds/worker/fetches/mono.en.zst
[task 2024-11-23T00:04:40.096Z] + echo '#### Deduplicating'
[task 2024-11-23T00:04:40.096Z] #### Deduplicating
[task 2024-11-23T00:04:40.097Z] + /builds/worker/fetches/dedupe
[task 2024-11-23T00:04:40.097Z] ++ zstdmt -dc /builds/worker/artifacts/tmp/original.src.zst
[task 2024-11-23T00:04:40.097Z] + shuf --random-source=/dev/fd/63
[task 2024-11-23T00:04:40.097Z] + paste /dev/fd/63 /dev/fd/62
[task 2024-11-23T00:04:40.097Z] + zstdmt
[task 2024-11-23T00:04:40.097Z] ++ get_seeded_random 42
[task 2024-11-23T00:04:40.097Z] ++ seed=42
[task 2024-11-23T00:04:40.097Z] ++ openssl enc -aes-256-ctr -pass pass:42 -nosalt
[task 2024-11-23T00:04:40.097Z] ++ zstdmt -dc /builds/worker/artifacts/tmp/original.trg.zst
[task 2024-11-23T00:04:40.099Z] File stdin isn't normal.  Using slower read() instead of mmap().  No progress bar.
[task 2024-11-23T00:04:40.110Z] Kept 1691 / 1693 = 0.998819
[task 2024-11-23T00:04:40.115Z] + zstdmt -dc /builds/worker/artifacts/tmp/all.zst
[task 2024-11-23T00:04:40.116Z] + cut -f1
[task 2024-11-23T00:04:40.116Z] + zstdmt
[task 2024-11-23T00:04:40.126Z] + zstdmt -dc /builds/worker/artifacts/tmp/all.zst
[task 2024-11-23T00:04:40.126Z] + cut -f2
[task 2024-11-23T00:04:40.126Z] + zstdmt
[task 2024-11-23T00:04:40.137Z] ++ zstdmt -dc /builds/worker/artifacts/corpus.ru.zst
[task 2024-11-23T00:04:40.138Z] ++ wc -l
[task 2024-11-23T00:04:40.140Z] + src_len=1691
[task 2024-11-23T00:04:40.141Z] ++ zstdmt -dc /builds/worker/artifacts/corpus.en.zst
[task 2024-11-23T00:04:40.141Z] ++ wc -l
[task 2024-11-23T00:04:40.145Z] + trg_len=1691
[task 2024-11-23T00:04:40.145Z] + '[' 1691 '!=' 1691 ']'
[task 2024-11-23T00:04:40.145Z] + rm -rf /builds/worker/artifacts/tmp
[task 2024-11-23T00:04:40.146Z] + echo '###### Done: Merging datasets'
[task 2024-11-23T00:04:40.146Z] ###### Done: Merging datasets
[fetches 2024-11-23T00:04:40.146Z] removing /builds/worker/fetches
[fetches 2024-11-23T00:04:40.147Z] finished
+ exit_code=0
+ docker cp taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ:/builds/worker/artifacts artifact0
+ docker rm taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ
taskcontainer_ELY7yPCTR76Mu6_FzgiLbQ
+ exit 0
[taskcluster 2024-11-23T00:04:41.660Z]    Exit Code: 0
[taskcluster 2024-11-23T00:04:41.660Z]    User Time: 99.083ms
[taskcluster 2024-11-23T00:04:41.660Z]  Kernel Time: 250.659ms
[taskcluster 2024-11-23T00:04:41.660Z]    Wall Time: 33.020569096s
[taskcluster 2024-11-23T00:04:41.660Z]       Result: SUCCEEDED
[taskcluster 2024-11-23T00:04:41.661Z] === Task Finished ===
[taskcluster 2024-11-23T00:04:41.661Z] Task Duration: 33.021191038s
[taskcluster 2024-11-23T00:04:41.777Z] Uploading artifact public/build/corpus.en.zst from file /home/task_173232023300468/artifact0/corpus.en.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-11-17T22:50:31.174Z
[taskcluster 2024-11-23T00:04:41.786Z] Uploading artifact public/build/corpus.ru.zst from file /home/task_173232023300468/artifact0/corpus.ru.zst with content encoding "identity", mime type "application/zstd" and expiry 2025-11-17T22:50:31.174Z
[taskcluster 2024-11-23T00:04:41.931Z] [mounts] Preserving cache: Moving "/home/task_173232023300468/cache0" to "caches/Syfv_SBwS_KSDeePaFMAww"
[taskcluster 2024-11-23T00:04:42.007Z] Uploading link artifact public/logs/live.log to artifact public/logs/live_backing.log with expiry 2025-11-17T22:50:31.174Z
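
For readers following the trace of merge-corpus.sh above, the steps it appears to perform (concatenate the two source files and the two target files, pair them line by line, drop duplicate pairs, split the pairs back out, and check that both sides have the same line count) can be sketched roughly as below. This is a minimal illustration, not the script from the repository: the function name `merge_parallel` is hypothetical, the inputs are assumed to be uncompressed plain text rather than .zst files, `awk` stands in for the `dedupe` binary (whose exact matching rules are not visible in the log), and the seeded `shuf` step is left out. It also assumes no sentence contains a tab.

```shell
# Hypothetical helper, not part of mozilla/translations.
# Usage: merge_parallel SRC1 SRC2 TRG1 TRG2 RES_SRC RES_TRG
# The body runs in a subshell so the strict-mode options do not leak to the caller.
merge_parallel() (
  set -euo pipefail
  src1=$1; src2=$2; trg1=$3; trg2=$4; res_src=$5; res_trg=$6
  tmp_dir=$(mktemp -d)

  # Concatenate each side, mirroring the `cat <(...) <(...)` steps in the trace.
  cat "$src1" "$src2" > "$tmp_dir/original.src"
  cat "$trg1" "$trg2" > "$tmp_dir/original.trg"

  # Pair source and target lines with a tab, then keep only the first
  # occurrence of each pair (awk is an assumed stand-in for `dedupe`).
  paste "$tmp_dir/original.src" "$tmp_dir/original.trg" \
    | awk '!seen[$0]++' > "$tmp_dir/all"

  # Split the surviving pairs back into the two output files.
  cut -f1 "$tmp_dir/all" > "$res_src"
  cut -f2 "$tmp_dir/all" > "$res_trg"
  rm -rf "$tmp_dir"

  # Both sides must stay aligned, as in the `[ 1691 != 1691 ]` check above.
  src_len=$(wc -l < "$res_src" | tr -d ' ')
  trg_len=$(wc -l < "$res_trg" | tr -d ' ')
  if [ "$src_len" != "$trg_len" ]; then
    echo "Error: $res_src has $src_len lines but $res_trg has $trg_len" >&2
    exit 1
  fi
  echo "$src_len"
)
```

In the run logged above, the real pipeline reported "Kept 1691 / 1693", meaning two of the 1693 merged pairs were dropped as duplicates, and both output files ended up with 1691 lines.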