
Actual usage of reconstruction loss on initialization stage #4

Open
the-jb opened this issue Jan 23, 2024 · 1 comment
Comments

@the-jb

the-jb commented Jan 23, 2024

Dear authors,

I was very impressed by your amazing work, which enables training dynamic document clustering through the tokenization process. I have carefully read your paper and the code to understand the details, but I still have a question about the intent of the reconstruction loss in the initialization stage.

First, for building the reconstruction loss $\mathcal L_{Rec}$, your paper says:

> warm-up the model by passing the continuous representation $d_T$ to the reconstruction model instead of the docid representation $z_T$

However, I found that the code actually calculates the reconstruction loss as below, and I cannot find the replacement described above anywhere in the code:

GenRet/run.py, lines 403 to 404 at d3c1609:

```python
cl_dd_loss = OurTrainer.compute_contrastive_loss(
    quant_doc_embeds + doc_embeds - doc_embeds.detach(), doc_embeds.detach(), gathered=False)  # reconstruction
```

Furthermore, in the `main()` function, `loss_w` is configured as 1:

GenRet/run.py, line 1154 at d3c1609:

```python
config['loss_w'] = 1
```

This selects the pre-configured weight set `w_1`, which excludes the reconstruction loss, as shown below.

GenRet/run.py, lines 687 to 688 at d3c1609:

```python
w_1 = {'cl_loss': 0.5, 'all_cl_loss': 0, 'ce_loss': 0, 'code_loss': 0.5, 'aux_code_loss': 0, 'mse_loss': 0,
       'cl_dd_loss': 0, 'clb_loss': 0}
```
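If the individual losses are combined as a weighted sum over that dictionary, a zero weight removes the corresponding term entirely. A minimal sketch of the effect (the combining function and dummy loss values are my assumption for illustration, not code from the repo):

```python
# Weight dict mirroring w_1 from run.py; 'cl_dd_loss' (reconstruction) is 0.
w_1 = {'cl_loss': 0.5, 'all_cl_loss': 0, 'ce_loss': 0, 'code_loss': 0.5,
       'aux_code_loss': 0, 'mse_loss': 0, 'cl_dd_loss': 0, 'clb_loss': 0}

# Dummy per-term loss values, purely for illustration.
losses = {name: 1.0 for name in w_1}

# Hypothetical weighted-sum combiner: zero-weighted terms contribute nothing.
total = sum(w_1[name] * losses[name] for name in w_1)
print(total)  # 1.0 -> only cl_loss and code_loss (0.5 each) contribute
```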

In summary, I found an inconsistency between the paper and the code in how the reconstruction loss is calculated during the codebook initialization stage.
If I understand correctly, the reconstruction loss is not actually used in the initialization phase. Is this correct?

Thanks for sharing your wonderful work.
Best regards,

JB

@sunnweiwei
Owner

Hi,

Thank you for pointing out this issue. The code is indeed inconsistent with the paper in this part. Some analysis showed that, with the query-doc contrastive loss, the model already outputs document embeddings suitable for initialization, so the doc-doc contrastive loss (the reconstruction loss) appeared to be less useful and was omitted.

Best, Weiwei
