
Actual usage of reconstruction loss on initialization stage #4

Open
the-jb opened this issue Jan 23, 2024 · 1 comment
Comments

@the-jb

the-jb commented Jan 23, 2024

Dear authors,

I was very impressed by your amazing work, which enables training dynamic document clustering through the tokenization process. I have carefully read your paper and the code to understand the details, but I still have a question about the intent of the reconstruction loss in the initialization stage.

First, for building the reconstruction loss $\mathcal L_{Rec}$, your paper says:

> warm-up the model by passing the continuous representation $d_T$ to the reconstruction model instead of the docid representation $z_T$

However, I found that the code actually calculates the reconstruction loss as below, and I cannot find the replacement described above anywhere in the code:

GenRet/run.py, lines 403 to 404 at d3c1609:

```python
cl_dd_loss = OurTrainer.compute_contrastive_loss(
    quant_doc_embeds + doc_embeds - doc_embeds.detach(), doc_embeds.detach(), gathered=False)  # reconstruction
```

Furthermore, in the `main()` function, `loss_w` is configured as 1:

GenRet/run.py, line 1154 at d3c1609:

```python
config['loss_w'] = 1
```

This selects the pre-configured weight set `w_1`, which excludes the reconstruction loss, as shown below.

GenRet/run.py, lines 687 to 688 at d3c1609:

```python
w_1 = {'cl_loss': 0.5, 'all_cl_loss': 0, 'ce_loss': 0, 'code_loss': 0.5, 'aux_code_loss': 0, 'mse_loss': 0,
       'cl_dd_loss': 0, 'clb_loss': 0}
```
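If the individual losses are combined as a weighted sum over that dictionary, a zero weight removes the corresponding term entirely. A minimal sketch of the effect (the combining function and dummy loss values are my assumption for illustration, not code from the repo):

```python
# Weight dict mirroring w_1 from run.py; 'cl_dd_loss' (reconstruction) is 0.
w_1 = {'cl_loss': 0.5, 'all_cl_loss': 0, 'ce_loss': 0, 'code_loss': 0.5,
       'aux_code_loss': 0, 'mse_loss': 0, 'cl_dd_loss': 0, 'clb_loss': 0}

# Dummy per-term loss values, purely for illustration.
losses = {name: 1.0 for name in w_1}

# Hypothetical weighted-sum combiner: zero-weighted terms contribute nothing.
total = sum(w_1[name] * losses[name] for name in w_1)
print(total)  # 1.0 -> only cl_loss and code_loss (0.5 each) contribute
```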

In summary, I found an inconsistency between the paper and the code in how the reconstruction loss is calculated during the codebook initialization stage.
If I understand correctly, the reconstruction loss is not actually used in the initialization phase. Is this correct?

Thanks for sharing your wonderful work.
Best regards,

JB

@sunnweiwei
Owner

Hi,

Thank you for pointing out this issue. The code is indeed inconsistent with the paper in this part. Some analysis showed that, with the query-doc contrastive loss, the model already outputs document embeddings suitable for initialization, so the doc-doc contrastive loss (the reconstruction loss) appeared to be less useful and was omitted.

Best, Weiwei
