No objects to concatenate in clustering.fit #54

mrbarbitoff · 2024-04-02T11:49:19Z

Hi!

After updating to the latest release of clusTCR, I get an error when fitting the clustering to my data (complete traceback below). The same calls worked fine with the previous version. I initialize the clustering object with clustering = Clustering(n_cpus=24, chain='A'); the same error occurs if I don't specify the chain, for both TRA and TRB input data. I'd be grateful for your help with this issue.

ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 output = clustering.fit(tra_data, include_vgene = True, 
      2                         cdr3_col="aaSeqCDR3", 
      3                         v_gene_col="vGene")

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/clustcr/clustering/tools.py:96, in timeit.<locals>.timed(*args, **kwargs)
     94 def timed(*args, **kwargs):
     95     start = time.time()
---> 96     result = myfunc(*args, **kwargs)
     97     end = time.time()
     98     print(f'Total time to run ClusTCR: {(end-start):.3f}s')

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/clustcr/clustering/clustering.py:429, in Clustering.fit(self, data, include_vgene, cdr3_col, v_gene_col, alpha)
    425 """
    426 Function that calls the indicated clustering method and returns clusters in a ClusteringResult
    427 """
    428 if include_vgene:
--> 429     return self._vgene_clustering(data, cdr3_col, v_gene_col)
    430 else:
    431     try:

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/clustcr/clustering/clustering.py:346, in Clustering._vgene_clustering(self, data, cdr3_col, v_gene_col)
    343 super_clusters = self._faiss(subset["junction_aa"])
    344 # Second clustering step
    345 clusters = ClusteringResult(
--> 346     MCL_multiprocessing_from_preclusters(
    347         super_clusters, self.mcl_params, self.n_cpus
    348         ), chain=self.chain
    349                             ).clusters_df
    350 clusters.cluster += c # adjust cluster identifiers to ensure they stay unique
    351 subset = subset.merge(clusters, left_on="junction_aa", right_on="junction_aa")

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/clustcr/clustering/methods.py:139, in MCL_multiprocessing_from_preclusters(preclust, mcl_hyper, n_cpus)
    137     if c != 0:
    138         nodelist[c]['cluster'] += nodelist[c - 1]['cluster'].max() + 1
--> 139 return pd.concat(nodelist, ignore_index=True)

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/pandas/core/reshape/concat.py:382, in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    379 elif copy and using_copy_on_write():
    380     copy = False
--> 382 op = _Concatenator(
    383     objs,
    384     axis=axis,
    385     ignore_index=ignore_index,
    386     join=join,
    387     keys=keys,
    388     levels=levels,
    389     names=names,
    390     verify_integrity=verify_integrity,
    391     copy=copy,
    392     sort=sort,
    393 )
    395 return op.get_result()

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/pandas/core/reshape/concat.py:445, in _Concatenator.__init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    442 self.verify_integrity = verify_integrity
    443 self.copy = copy
--> 445 objs, keys = self._clean_keys_and_objs(objs, keys)
    447 # figure out what our result ndim is going to be
    448 ndims = self._get_ndims(objs)

File ~/anaconda3/envs/clustcr_103/lib/python3.10/site-packages/pandas/core/reshape/concat.py:507, in _Concatenator._clean_keys_and_objs(self, objs, keys)
    504     objs_list = list(objs)
    506 if len(objs_list) == 0:
--> 507     raise ValueError("No objects to concatenate")
    509 if keys is None:
    510     objs_list = list(com.not_none(*objs_list))

ValueError: No objects to concatenate

Yury


svalkiers · 2024-04-03T05:33:54Z

Hi Yury,

Sorry for the inconvenience. I believe this error indicates that your clustering result is empty (i.e. no clusters were detected), so pd.concat receives an empty list and has nothing to concatenate. I will update the script to return None in that case instead.
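To illustrate the kind of guard described above, here is a minimal sketch, not the actual clusTCR patch. The helper name concat_or_none is hypothetical; it only shows how an empty list of per-precluster DataFrames can be handled before calling pd.concat, which raises "No objects to concatenate" on an empty list.

```python
import pandas as pd


def concat_or_none(frames):
    """Concatenate a list of DataFrames, or return None if there are none.

    Hypothetical helper for illustration; not part of clusTCR.
    """
    # Drop missing entries so a list of only None values is treated as empty.
    frames = [f for f in frames if f is not None]
    if not frames:
        # pd.concat([]) would raise ValueError("No objects to concatenate").
        return None
    return pd.concat(frames, ignore_index=True)
```

With this guard the caller has to check for None before using the result, for example before merging the clusters back onto the input table.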

Another option is to loosen the stringency by clustering on the CDR3 amino acid sequence only, i.e. without include_vgene = True.

Best,
Sebastiaan

svalkiers · 2024-04-03T08:11:11Z

The issue should be fixed in the latest build (clustcr-1.0.3+3.g5fa6b46).
Let me know if you encounter any further problems.

Cheers,
Sebastiaan

mrbarbitoff · 2024-04-03T11:47:10Z

Hi @svalkiers

Thank you for your reply! It turns out the empty clustering result was caused by my accidentally installing the GPU version instead of the regular one during the update. Sorry about that.
The issue is resolved.
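
For anyone who runs into the same symptom, one way to see which build is present in the active environment is to query the installed distribution metadata. This is a minimal sketch using only the standard library; the distribution names "clustcr" and "clustcr-gpu" are assumptions, and the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            pass
    return found


# Assumed distribution names; adjust to match how the package was installed.
print(installed_versions(["clustcr", "clustcr-gpu"]))
```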
