504 Server error when running comet-score using multiple machines #162

Open
Smu-Tan opened this issue Aug 23, 2023 · 10 comments
Labels
bug Something isn't working

Comments


Smu-Tan commented Aug 23, 2023

🐛 Bug

Hi! A 504 server error is encountered when running multiple comet-score scripts. See below:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 261, in hf_raise_for_status
    response.raise_for_status()
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/api/models/Unbabel/wmt22-comet-da/revision/main

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 46, in download_model
    model_path = snapshot_download(
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/_snapshot_download.py", line 186, in snapshot_download
    repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1868, in repo_info
    return method(
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1678, in model_info
    hf_raise_for_status(r)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 303, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 504 Server Error: Gateway Time-out for url: https://huggingface.co/api/models/Unbabel/wmt22-comet-da/revision/main

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 51, in download_model
    checkpoint_path = download_model_legacy(model, saving_directory)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/download_utils.py", line 224, in download_model_legacy
    raise Exception(
Exception: Unbabel/wmt22-comet-da is not in the available_legacy_metrics or is a valid checkpoint folder.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/stan1/anaconda3/envs/prefix_mt/bin/comet-score", line 8, in <module>
    sys.exit(score_command())
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/cli/score.py", line 154, in score_command
    model_path = download_model(cfg.model, saving_directory=cfg.model_storage_path)
  File "/home/stan1/anaconda3/envs/prefix_mt/lib/python3.9/site-packages/comet/models/__init__.py", line 53, in download_model
    raise KeyError(f"Model {model} not supported by COMET.")
KeyError: Model Unbabel/wmt22-comet-da not supported by COMET.

To Reproduce

Here's the reproduction code template; please ignore the task and seed settings.

#!/bin/bash

RESULT_DIR=zero-shot

TASKS=(zs)
SEEDS=(1234)
SRCAR=('de' 'nl' 'sv' 'da' 'is')
TGTAR=('de' 'nl' 'sv' 'da' 'is')

for (( t=0; t<${#TASKS[@]}; t++ ))
do
  for (( s=0; s<${#SEEDS[@]}; s++ ))
  do
    first_id=$((t*${#SEEDS[@]}+s))
    for (( i=0; i<${#SRCAR[@]}; i++ ))
    do
      second_id=$((first_id*${#SRCAR[@]}+i))
      for (( j=0; j<${#TGTAR[@]}; j++ ))
      do
        third_id=$((second_id*${#TGTAR[@]}+j))

        if [ "$third_id" -eq "$SLURM_ARRAY_TASK_ID" ]
        then

          SRC=${SRCAR[i]}
          TGT=${TGTAR[j]}

          if [[ "$SRC" != "$TGT" ]]
          then

            echo "SRC-TGT: $SRC-$TGT"

            SOURCE_SENT=${RESULT_DIR}/${SRC}-${TGT}/test-src.txt
            HYPOTHESIS=${RESULT_DIR}/${SRC}-${TGT}/test-sys.txt
            REFERENCE=${RESULT_DIR}/${SRC}-${TGT}/test-ref.txt
            comet-score -s ${SOURCE_SENT} -t ${HYPOTHESIS} -r ${REFERENCE} --quiet --only_system > ${RESULT_DIR}/${SRC}-${TGT}/test_comet.txt

          fi
        fi

      done
    done
  done
done
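
Editor's sketch: the nested loops above map each SLURM array index to one (source, target) language pair, so every array task launches its own comet-score process, and each process contacts the Hugging Face Hub on its own. The Python below inverts the script's index arithmetic with divmod. The function name `decode_task_id` is hypothetical and used only for illustration; the language list is copied from the script.

```python
from typing import List, Optional, Tuple

LANGS = ["de", "nl", "sv", "da", "is"]  # same values as SRCAR / TGTAR in the script


def decode_task_id(
    task_id: int,
    n_tasks: int,
    n_seeds: int,
    src: List[str],
    tgt: List[str],
) -> Optional[Tuple[int, int, str, str]]:
    """Invert third_id = ((t * n_seeds + s) * len(src) + i) * len(tgt) + j.

    Returns (task index, seed index, source language, target language), or None
    when the id is out of range or maps to an identical pair, which the
    script skips.
    """
    total = n_tasks * n_seeds * len(src) * len(tgt)
    if not 0 <= task_id < total:
        return None
    rest, j = divmod(task_id, len(tgt))   # innermost loop index
    rest, i = divmod(rest, len(src))      # source-language loop index
    t, s = divmod(rest, n_seeds)          # task and seed loop indices
    if src[i] == tgt[j]:
        return None
    return t, s, src[i], tgt[j]
```

With one task, one seed, and five languages on each side, ids 0 to 24 are valid and 20 of them map to a distinct pair, so up to 20 array tasks may request the same model at roughly the same time.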

Environment

OS: Linux (slurm)
comet version: newest

Smu-Tan added the bug label Aug 23, 2023
@ricardorei
Collaborator

Hmm, this seems to be a problem with downloading the model, on the HF side. Have you tried again recently?

@ricardorei
Collaborator

It could be that the HF Hub was down for a period.
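
Editor's sketch: if the failure is a transient gateway timeout, one possible mitigation is to retry the download with exponential backoff instead of failing on the first error. The helper name `retry_with_backoff` and its parameters are hypothetical, not part of COMET or huggingface_hub. The traceback above shows that `download_model` surfaces the failure as a `KeyError`, so the sketch catches `Exception` broadly, which means a genuinely unsupported model name would also be retried before the error is re-raised.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn() until it succeeds, waiting base_delay * 2**k between tries.

    The last failure is re-raised unchanged once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")


# Possible usage (assumes unbabel-comet is installed and the Hub is reachable):
#   from comet import download_model
#   path = retry_with_backoff(lambda: download_model("Unbabel/wmt22-comet-da"))
```

When many machines start at once, downloading the model a single time beforehand and reusing the cached copy may avoid the concurrent requests altogether; retries only paper over short outages.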

@haroon830

@Smu-Tan have you solved your problem? I'm getting the same error when downloading the model.


weichuanW commented Dec 12, 2023

@ricardorei Hi, I ran the code

from comet import download_model, load_from_checkpoint
model_path = download_model("Unbabel/XCOMET-XL")

and get this exception:

Exception: Unbabel/XCOMET-XL is not in the available_legacy_metrics or is a valid checkpoint folder.

After checking this file, I found that available_legacy_metrics in comet/models/download_utils.py does not have the corresponding key-value pair. Can you update this file or tell me how to download the model directly from HF?

The current version of unbabel-comet is 2.2.0.
Best.

@ricardorei
Collaborator

Hey! Hmm, this is weird. available_legacy_metrics should only be consulted when the model is not found on Hugging Face. What is your huggingface_hub version? Can you send me the pip freeze output?
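
Editor's sketch: the traceback in the original report shows the final `KeyError` sitting on top of two earlier exceptions, so the message about available_legacy_metrics can hide the real cause (there, a 504 from the Hub). Walking Python's standard `__cause__` / `__context__` links recovers the first failure. The helper `exception_chain` and the simulated function below are hypothetical; they only imitate the shape of the chain seen in the traceback and do not call COMET.

```python
from typing import List


def exception_chain(exc: BaseException) -> List[BaseException]:
    """Return [exc, its cause or context, ...]; the last item is the root failure."""
    chain: List[BaseException] = []
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        chain.append(current)
        # Prefer the explicit cause ("raise ... from e"), else the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return chain


def simulated_download() -> None:
    """Imitates the three-level failure from the report (no network involved)."""
    try:
        try:
            raise ConnectionError("504 Server Error: Gateway Time-out")
        except Exception:
            raise Exception("not in the available_legacy_metrics")
    except Exception:
        raise KeyError("Model not supported by COMET.")


try:
    simulated_download()
except KeyError as err:
    root = exception_chain(err)[-1]
    print(type(root).__name__, root)
```

Wrapping a real `download_model` call in the same `try`/`except` and printing the last element of the chain is one way to tell a Hub outage apart from an authentication or naming problem.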

@weichuanW

OK, the following is the pip freeze list:
accelerate==0.23.0
aeidon==1.12
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
antlr4-python3-runtime==4.8
anyio==3.7.1
argh==0.30.2
async-timeout==4.0.3
atomicwrites==1.4.1
attrs==23.1.0
beautifulsoup4==4.12.2
bitarray==2.8.3
bitsandbytes==0.41.1
blessed==1.20.0
blis==0.7.11
catalogue==2.0.10
certifi==2022.12.7
cffi==1.16.0
chardet==5.2.0
charset-normalizer==2.0.12
cheroot==10.0.0
chinese-converter==1.1.1
click==8.1.7
cloudpathlib==0.16.0
cmake==3.25.0
colorama==0.4.6
coloredlogs==10.0
confection==0.1.3
contourpy==1.2.0
coverage==4.5.4
cycler==0.12.1
cymem==2.0.8
Cython==3.0.5
datasets==2.14.5
dill==0.3.7
distro==1.8.0
docstring-parser==0.15
docx2txt==0.8
einops==0.7.0
en-core-web-lg @ https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.7.0/en_core_web_lg-3.7.0-py3-none-any.whl#sha256=708da1110fbe1163d059de34a2cbedb1db65c26e1e624ca925897a2711cb7d77
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.0/en_core_web_sm-3.7.0-py3-none-any.whl#sha256=6215d71a3212690e9aec49408a27e3fe6ad7cd6c715476e93d70dc784041e93e
enlighten==1.10.1
entmax==1.1
evaluate==0.4.1
exceptiongroup==1.1.3
fairseq==0.12.2
faiss==1.7.4
fastapi==0.104.1
fastbm25==0.0.2
fastBPE==0.1.1
fastest==0.3.1
fasttext==0.9.2
ffmpy==0.3.1
filelock==3.9.0
fluent.syntax==0.19.0
fonttools==4.44.0
frozenlist==1.4.0
fsspec==2023.6.0
gcld3==3.0.13
gradio==4.8.0
gradio_client==0.7.1
h11==0.14.0
httpcore==1.0.2
httpx==0.25.2
huggingface-hub==0.16.4
humanfriendly==10.0
hydra-core==1.0.7
icu==0.0.1
idna==3.4
importlib-resources==6.1.1
iniconfig==2.0.0
iniparse==0.5
jaraco.functools==3.9.0
Jinja2==3.1.2
joblib==1.3.2
jsonargparse==3.13.1
jsonschema==4.20.0
jsonschema-specifications==2023.11.2
kiwisolver==1.4.5
langcodes==3.3.0
langdetect==1.0.9
latexcodec==2.0.1
Levenshtein==0.23.0
lightning-utilities==0.9.0
lingua-language-detector==1.3.3
lit==15.0.7
lxml==4.9.3
markdown-it-py==3.0.0
MarkupSafe==2.1.2
matplotlib==3.8.1
mdurl==0.1.2
mistletoe==1.2.1
more-itertools==10.1.0
mpmath==1.3.0
mtdata==0.4.0
multidict==6.0.4
multiprocess==0.70.15
murmurhash==1.0.10
networkx==3.0
numpy==1.24.4
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.2.10.91
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusparse-cu11==11.7.4.91
nvidia-nccl-cu11==2.14.3
nvidia-nvtx-cu11==11.7.91
omegaconf==2.0.6
optimum==1.13.2
orjson==3.9.10
packaging==23.2
pandas==2.1.1
pathtools==0.1.2
peft @ git+https://github.com/huggingface/peft@56556faa17263be8ef1802c172141705b71c28dc
phply==1.2.6
Pillow==9.3.0
pluggy==0.13.1
ply==3.11
polyglot==16.7.4
portalocker==2.3.0
prefixed==0.7.0
preshed==3.0.9
protobuf==4.24.4
psutil==5.9.6
py==1.11.0
pyarrow==13.0.0
pybind11==2.11.1
pybtex==0.24.0
pycld2==0.42
pycparser==2.21
pydantic==2.4.2
pydantic_core==2.10.1
pydub==0.25.1
pyenchant==3.2.2
Pygments==2.17.2
PyICU==2.11
pyparsing==3.1.1
pytest==4.6.11
pytest-cov==2.10.1
python-dateutil==2.8.2
python-Levenshtein==0.23.0
python-multipart==0.0.6
pytorch-lightning==2.1.0
pytz==2023.3.post1
PyYAML==6.0.1
rank-bm25==0.2.2
rapidfuzz==3.4.0
referencing==0.32.0
regex==2023.10.3
requests==2.28.1
responses==0.18.0
rich==13.7.0
rpds-py==0.13.2
ruamel.yaml==0.17.32
ruamel.yaml.clib==0.2.8
sacrebleu==2.3.1
sacremoses==0.0.53
safetensors==0.4.0
scikit-build==0.17.6
scipy==1.11.3
seaborn==0.13.0
semantic-version==2.10.0
sentencepiece==0.1.99
shellingham==1.5.4
shtab==1.6.4
six==1.16.0
smart-open==6.4.0
sniffio==1.3.0
soupsieve==2.5
spacy==3.7.2
spacy-language-detection==0.2.1
spacy-legacy==3.0.12
spacy-loggers==1.0.5
srsly==2.4.8
starlette==0.27.0
sympy==1.12
tabulate==0.9.0
thinc==8.2.1
tokenizers==0.14.1
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.0
torch==2.0.1
torchaudio==2.0.2+cu117
torchmetrics==0.10.3
torchvision==0.15.2+cu117
tqdm==4.66.1
transformers==4.34.1
translate-toolkit==3.10.1
transliterate==1.10.2
triton==2.0.0
trl==0.7.4
typer==0.9.0
typing_extensions==4.8.0
tyro==0.5.17
tzdata==2023.3
unbabel-comet==2.2.0
urllib3==1.26.13
uvicorn==0.24.0.post1
vobject==0.9.6.1
wasabi==1.1.2
watchdog==0.9.0
wcwidth==0.2.8
weasel==0.3.3
websockets==11.0.3
wmtformat @ git+https://github.com/wmt-conference/wmt-format-tools.git@49983f17d8c99207c66a7f43fa49aa71d0692e48
xxhash==3.4.1
yarl==1.9.2
zhon==2.0.2

The Hugging Face Hub version is huggingface-hub==0.16.4. I upgraded it to huggingface-hub 0.19.4, but it still fails with the same error :)

@weichuanW

The problem was solved by manually downloading the model from the Hugging Face repo. Thanks.
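
Editor's sketch: after a manual download, the checkpoint has to be located on disk and passed to `load_from_checkpoint` (the function imported in the snippet earlier in this thread). The helper `find_checkpoint` is hypothetical, and the default layout `checkpoints/model.ckpt` inside the downloaded folder is an assumption about how the model repository is organized; adjust `relative` if your copy differs.

```python
from pathlib import Path
from typing import Sequence, Union


def find_checkpoint(
    model_dir: Union[str, Path],
    relative: Sequence[str] = ("checkpoints", "model.ckpt"),  # assumed layout
) -> Path:
    """Return the checkpoint path inside a manually downloaded model folder."""
    ckpt = Path(model_dir).joinpath(*relative)
    if not ckpt.is_file():
        raise FileNotFoundError(f"No checkpoint found at {ckpt}")
    return ckpt


# Possible usage (assumes unbabel-comet is installed; the path is a placeholder):
#   from comet import load_from_checkpoint
#   model = load_from_checkpoint(str(find_checkpoint("/path/to/downloaded/model")))
```

Failing early with `FileNotFoundError` keeps a wrong folder from being mistaken for a download problem.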

@mohataher

You have to acknowledge the model's license on the web. Then perform a CLI login before downloading the model from your code.

@ricardorei
Collaborator

I forgot about this issue. Thanks for answering, @mohataher.


laelhalawani commented Jun 19, 2024

SOLVED. I had the same issue:
'Unbabel/wmt23-cometkiwi-da-xl' not supported by COMET
It turned out to be an issue with logging in to Hugging Face.
If you have huggingface-cli installed, go to huggingface.co/settings/tokens to generate your token, then
run huggingface-cli login and paste in the token.
Now if you run the code again, it should successfully download the model.
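
Editor's sketch: on a cluster where an interactive `huggingface-cli login` is inconvenient, the same login can be done from Python with `huggingface_hub.login(token=...)`. The helper names below are hypothetical. Reading the token from the `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` environment variables is an assumption about how the job is configured; the token itself still has to be generated on the settings page mentioned above.

```python
import os
from typing import Mapping, Optional

# Environment variable names checked in order; assumed to be set by the job script.
TOKEN_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token found in the environment, else None."""
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def login_if_token_available(env: Mapping[str, str] = os.environ) -> bool:
    """Log in to the Hugging Face Hub when a token is configured.

    Returns False without touching the network when no token is set.
    """
    token = resolve_hf_token(env)
    if token is None:
        return False
    # Imported lazily so this module can be loaded without huggingface_hub.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling `login_if_token_available()` before `download_model(...)` is one way to make gated models downloadable in batch jobs; the license still has to be accepted on the model page first.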

6 participants