
Good work, enjoyed reading it. And some questions about the details of the implementation #3

Open
@VPeterV

Description

Hi! I really like this work. The paper is very precise and readable, but I am still curious about some details of how the potential functions are computed.

  1. To my understanding, if the model learns well, the marginalized edge potential sum_{y_t} psi_{st}(y_s, y_t) should equal the node potential psi_s(y_s) that the model learns. I notice that in this implementation, when computing the edge potential, the denominator is computed as `sum_s = torch.sum(logits, dim=2).unsqueeze(2) + eps` and `sum_t = torch.sum(logits, dim=1).unsqueeze(1) + eps`, rather than from `pred_node`. Have you tested using `pred_node` instead, and if so, is the performance sensitive to this choice? (Both options are sketched below this list.)
  2. I also notice that the aforementioned denominator is scaled by `norm_coef`, and I find that the denominator can sometimes be a very small value in log-space. Is the model sensitive to this hyper-parameter? If so, do you think that is caused by numerical-stability issues, or simply by the model's limited ability to learn this probability when the graph is sparse?
    Thanks in advance :)
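
For concreteness, here is a minimal sketch (not the repository's code) of the two normalization choices from question 1 and the `norm_coef` scaling from question 2, written in plain PyTorch. The tensor shapes, the `pred_node_s`/`pred_node_t` names, and the exact way `norm_coef` is applied are assumptions for illustration only; only the two `torch.sum(...)` lines are quoted from the implementation.

```python
import torch

# Minimal sketch, not the repository's code: it only illustrates the two
# normalization choices asked about above. The shape of `logits`
# (batch, n_labels, n_labels), the `pred_node_*` tensors, and the way
# `norm_coef` enters are all assumptions made for illustration.
def edge_denominators(logits, pred_node_s=None, pred_node_t=None,
                      eps=1e-8, norm_coef=1.0, use_pred_node=False):
    if use_pred_node and pred_node_s is not None and pred_node_t is not None:
        # Question 1's alternative: take the denominators directly from the
        # node predictions instead of marginalizing the edge scores.
        sum_s = pred_node_s.unsqueeze(2) + eps  # (batch, n_labels, 1)
        sum_t = pred_node_t.unsqueeze(1) + eps  # (batch, 1, n_labels)
    else:
        # The variant quoted from the implementation: marginalize the edge
        # scores over the other node's labels.
        sum_s = torch.sum(logits, dim=2).unsqueeze(2) + eps
        sum_t = torch.sum(logits, dim=1).unsqueeze(1) + eps
    # Question 2: the denominators are then scaled by norm_coef; the exact
    # form of the scaling shown here is an assumption.
    return sum_s * norm_coef, sum_t * norm_coef

# Purely illustrative usage on random scores:
logits = torch.rand(4, 5, 5)
pred_s, pred_t = torch.rand(4, 5), torch.rand(4, 5)
den_marg = edge_denominators(logits)
den_node = edge_denominators(logits, pred_s, pred_t, use_pred_node=True)
```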
