Prerequisite
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
{'CUDA available': True,
'CUDA_HOME': '/home/nfs03/cuda_tools/cuda-12.1',
'GCC': 'gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0',
'GPU 0,1,2,3,4,5,6': 'NVIDIA GeForce RTX 3090',
'MMEngine': '0.10.5',
'MUSA available': False,
'NVCC': 'Cuda compilation tools, release 12.1, V12.1.105',
'OpenCV': '4.10.0',
'PyTorch': '2.4.0+cu121',
'PyTorch compiling details': 'PyTorch built with:\n'
' - GCC 9.3\n'
' - C++ Version: 201703\n'
' - Intel(R) oneAPI Math Kernel Library Version '
'2022.2-Product Build 20220804 for Intel(R) 64 '
'architecture applications\n'
' - Intel(R) MKL-DNN v3.4.2 (Git Hash '
'1137e04ec0b5251ca2b4400a4fd3c667ce843d67)\n'
' - OpenMP 201511 (a.k.a. OpenMP 4.5)\n'
' - LAPACK is enabled (usually provided by '
'MKL)\n'
' - NNPACK is enabled\n'
' - CPU capability usage: AVX512\n'
' - CUDA Runtime 12.1\n'
' - NVCC architecture flags: '
'-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n'
' - CuDNN 90.1 (built against CUDA 12.4)\n'
' - Magma 2.6.1\n'
' - Build settings: BLAS_INFO=mkl, '
'BUILD_TYPE=Release, CUDA_VERSION=12.1, '
'CUDNN_VERSION=9.1.0, '
'CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, '
'CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 '
'-fabi-version=11 -fvisibility-inlines-hidden '
'-DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO '
'-DLIBKINETO_NOROCTRACER -DUSE_FBGEMM '
'-DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK '
'-DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC '
'-Wall -Wextra -Werror=return-type '
'-Werror=non-virtual-dtor -Werror=bool-operation '
'-Wnarrowing -Wno-missing-field-initializers '
'-Wno-type-limits -Wno-array-bounds '
'-Wno-unknown-pragmas -Wno-unused-parameter '
'-Wno-unused-function -Wno-unused-result '
'-Wno-strict-overflow -Wno-strict-aliasing '
'-Wno-stringop-overflow -Wsuggest-override '
'-Wno-psabi -Wno-error=pedantic '
'-Wno-error=old-style-cast -Wno-missing-braces '
'-fdiagnostics-color=always -faligned-new '
'-Wno-unused-but-set-variable '
'-Wno-maybe-uninitialized -fno-math-errno '
'-fno-trapping-math -Werror=format '
'-Wno-stringop-overflow, LAPACK_INFO=mkl, '
'PERF_WITH_AVX=1, PERF_WITH_AVX2=1, '
'PERF_WITH_AVX512=1, TORCH_VERSION=2.4.0, '
'USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, '
'USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, '
'USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, '
'USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, '
'USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, '
'USE_ROCM_KERNEL_ASSERT=OFF, \n',
'Python': '3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0]',
'TorchVision': '0.19.0+cu121',
'lmdeploy': "not installed:No module named 'lmdeploy'",
'numpy_random_seed': 2147483648,
'opencompass': '0.3.5+',
'sys.platform': 'linux',
'transformers': '4.46.2'}
Reproduces the problem - code/configuration sample
from mmengine.config import read_base
from opencompass.models import VLLM

model_abbr = "Llama3_8B_Base"
num_gpus = 2
lora_path = None
seed = 0
max_seq_len = 4096
max_out_len = 100
batch_size = 32
temperature = 0.0
top_p = 0.8
max_tokens = 1024

with read_base():
    from opencompass.configs.datasets.flores.flores_gen_806ede import flores_datasets

datasets = []
datasets += flores_datasets

models = [
    dict(
        type=VLLM,
        abbr=model_abbr,
        path="/home/nfs02/model/llama-3-8b-instruct",
        model_kwargs=dict(tensor_parallel_size=num_gpus, dtype='bfloat16',
                          seed=seed, max_model_len=max_seq_len, enable_lora=True,),
        max_out_len=max_out_len,
        max_seq_len=max_seq_len,
        batch_size=batch_size,
        generation_kwargs=dict(temperature=temperature, top_p=top_p, max_tokens=max_tokens,),
        stop_words=['<|end_of_text|>', '<|eot_id|>'],
        lora_path=lora_path,
        run_cfg=dict(num_gpus=num_gpus),
    ),
]

work_dir = './general_ability/Llama3/'
Reproduces the problem - command or script
opencompass x.py --debug
Reproduces the problem - error message
[2024-11-13 15:05:19,372] [opencompass.openicl.icl_retriever.icl_topk_retriever] [INFO] Creating index for index set...
0%| | 0/997 [00:00<?, ?it/s]
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/bin/opencompass", line 8, in <module>
[rank0]:     sys.exit(main())
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/cli/main.py", line 308, in main
[rank0]:     runner(tasks)
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/runners/base.py", line 38, in __call__
[rank0]:     status = self.launch(tasks)
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/runners/local.py", line 128, in launch
[rank0]:     task.run(cur_model=getattr(self, 'cur_model',
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/tasks/openicl_infer.py", line 88, in run
[rank0]:     self._inference()
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/tasks/openicl_infer.py", line 106, in _inference
[rank0]:     retriever = ICL_RETRIEVERS.build(retriever_cfg)
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
[rank0]:     return self.build_func(cfg, *args, **kwargs, registry=self)
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
[rank0]:     obj = obj_cls(**args) # type: ignore
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/openicl/icl_retriever/icl_topk_retriever.py", line 83, in __init__
[rank0]:     self.index = self.create_index()
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/openicl/icl_retriever/icl_topk_retriever.py", line 99, in create_index
[rank0]:     res_list = self.forward(dataloader,
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/opencompass/openicl/icl_retriever/icl_topk_retriever.py", line 131, in forward
[rank0]:     metadata = entry.pop('metadata')
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/_collections_abc.py", line 954, in pop
[rank0]:     value = self[key]
[rank0]:   File "/home/nfs03/anaconda3/envs/LLMs_Eval_Opencompass/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 270, in __getitem__
[rank0]:     return self.data[item]
[rank0]: KeyError: 'metadata'
Other information
'''
with torch.no_grad():
    metadata = entry.pop('metadata')
    raw_text = self.tokenizer.batch_decode(
        entry['input_ids'],
        skip_special_tokens=True,
        verbose=False)
    res = self.model.encode(raw_text, show_progress_bar=False)
'''
In the source code, TopkRetriever.forward pops the 'metadata' key from every batch it reads from the dataloader. With the flores_gen_806ede configuration, however, the batches built from the official datasets provided (such as flores) contain only the keys input_ids and attention_mask after processing, so the pop raises KeyError: 'metadata'.
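The failure mode can be reproduced without OpenCompass. The sketch below is only an illustration, not the library's code: it uses collections.UserDict as a stand-in for the tokenizer batch (the traceback shows pop going through _collections_abc.py and self.data, which is how a UserDict-style mapping behaves), and pop_metadata is a hypothetical helper that reports which keys are actually present. It is a diagnostic aid, not a fix, since the retriever presumably still needs real metadata for each entry.

```python
from collections import UserDict


def pop_metadata(entry):
    """Pop 'metadata' from a batch (hypothetical helper).

    Raises KeyError with the list of available keys when 'metadata' is absent,
    which is easier to debug than the bare KeyError: 'metadata'.
    """
    if 'metadata' not in entry:
        raise KeyError(
            "'metadata' missing from batch; available keys: "
            f"{sorted(entry.keys())}")
    return entry.pop('metadata')


# Stand-in for a processed batch as described above (values are made up).
batch = UserDict({'input_ids': [[1, 2, 3]], 'attention_mask': [[1, 1, 1]]})

try:
    batch.pop('metadata')  # same call as in the traceback
except KeyError as err:
    print('plain pop failed:', err)

try:
    pop_metadata(batch)
except KeyError as err:
    print('guarded pop failed:', err)
```

If the guarded version lists only input_ids and attention_mask, the missing key is being dropped before the batch reaches the retriever rather than inside forward itself.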