ValueError: negative shift count #360

Open
wiany11 opened this issue Jul 23, 2024 · 0 comments

from colbert.infra import Run, RunConfig, ColBERTConfig
from colbert import Indexer

if __name__ == "__main__":
    with Run().context(RunConfig(nranks=4, experiment="msmarco")):

        config = ColBERTConfig(
            nbits=16,
            root=".",
        )
        indexer = Indexer(checkpoint="./colbertv2.0", config=config)
        indexer.index(name="msmarco.nbits=16", collection="./collection.tsv")

When I set nbits to 16 and run the indexer, I get the following error:

Clustering 35121672 points in 128D to 262144 clusters, redo 1 times, 20 iterations
  Preprocessing in 9.63 s
  Iteration 19 (7666.62 s, search 7389.69 s): objective=8.38992e+06 imbalance=1.247 nsplit=0       
[Jul 23, 04:56:30] Loading decompress_residuals_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
[Jul 23, 04:56:34] Loading packbits_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
Process Process-2:
Traceback (most recent call last):
  File "/home/jovyan/.conda/envs/colbert/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/jovyan/.conda/envs/colbert/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/infra/launcher.py", line 134, in setup_new_process
    return_val = callee(config, *args)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 33, in encode
    encoder.run(shared_lists)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 68, in run
    self.train(shared_lists) # Trains centroids from selected passages
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 237, in train
    bucket_cutoffs, bucket_weights, avg_residual = self._compute_avg_residual(centroids, heldout)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 315, in _compute_avg_residual
    compressor = ResidualCodec(config=self.config, centroids=centroids, avg_residual=None)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/codecs/residual.py", line 61, in __init__
    x = (i >> (j - self.nbits)) & mask
ValueError: negative shift count
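
The failing expression can be reproduced outside ColBERT. The sketch below is not the library's code: `extract_chunks` is a hypothetical helper, and it assumes that `j` in the traceback walks bit positions within a single byte (at most 8), which the traceback does not show. Under that assumption, any nbits greater than 8 makes `j - nbits` negative, and Python refuses to shift by a negative amount.

```python
def extract_chunks(byte_value, nbits):
    """Split one byte into nbits-wide chunks, most significant chunk first.

    Hypothetical helper: only the shift-and-mask expression is taken from
    the traceback; the loop bounds are an assumption.
    """
    mask = (1 << nbits) - 1
    chunks = []
    # Assumed loop: j steps down from bit position 8 in strides of nbits.
    for j in range(8, 0, -nbits):
        # Same shape as the expression on the failing line of residual.py.
        chunks.append((byte_value >> (j - nbits)) & mask)
    return chunks


# nbits=2 divides a byte cleanly, so every shift amount is non-negative.
print(extract_chunks(0b11100100, 2))  # [3, 2, 1, 0]

# nbits=16 gives j - nbits = 8 - 16 = -8 on the first iteration.
try:
    extract_chunks(0b11100100, 16)
except ValueError as exc:
    print(exc)  # negative shift count
```

If the assumption holds, the error comes from the bit-packing logic itself rather than from the data or the clustering step, which finished before the crash.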

nbits = 16 and k = 1000 seem to be the correct configuration from the ColBERTv2 paper. How can I reproduce the experiment with these settings?
