
tanh while calculating attention scores #7

Open
gordicaleksa opened this issue Jan 24, 2021 · 1 comment

Comments

@gordicaleksa

Hey! I was interested in why you are using tanh here:

attn_src = torch.matmul(F.tanh(h_prime), self.a_src) # bs x n_head x n x 1

in BatchMultiHeadGraphAttention, get_layers.py. Did it stabilize the training? Is it some form of feature normalization?
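
For context, here is a minimal, self-contained sketch of how that line fits into a GAT-style multi-head attention forward pass. This is only my reading of the surrounding computation, not the repo's exact BatchMultiHeadGraphAttention code; the class name, initialization, and the omission of adjacency masking are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BatchMultiHeadGraphAttentionSketch(nn.Module):
    """Hypothetical sketch (not the repo's exact code): GAT-style multi-head
    attention with tanh applied to the projected features before the scores."""
    def __init__(self, n_head, f_in, f_out):
        super().__init__()
        self.w = nn.Parameter(torch.empty(n_head, f_in, f_out))
        self.a_src = nn.Parameter(torch.empty(n_head, f_out, 1))
        self.a_dst = nn.Parameter(torch.empty(n_head, f_out, 1))
        for p in (self.w, self.a_src, self.a_dst):
            nn.init.xavier_uniform_(p)

    def forward(self, h):
        # h: bs x n x f_in
        n = h.size(1)
        h_prime = torch.matmul(h.unsqueeze(1), self.w)            # bs x n_head x n x f_out
        # tanh squashes the projected features to [-1, 1] before the dot
        # product with the attention vectors (torch.tanh, since F.tanh is deprecated)
        attn_src = torch.matmul(torch.tanh(h_prime), self.a_src)  # bs x n_head x n x 1
        attn_dst = torch.matmul(torch.tanh(h_prime), self.a_dst)  # bs x n_head x n x 1
        # pairwise scores: score(i, j) = src_i + dst_j
        attn = attn_src.expand(-1, -1, -1, n) + attn_dst.expand(-1, -1, -1, n).permute(0, 1, 3, 2)
        attn = F.softmax(F.leaky_relu(attn, negative_slope=0.2), dim=-1)  # bs x n_head x n x n
        return torch.matmul(attn, h_prime)                        # bs x n_head x n x f_out
```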

@xptree
Owner

xptree commented Jan 28, 2021

Thanks for pointing this out.

Yes, the original GAT paper does not use the tanh activation there, but we found that it helps our training a little.
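
For reference, the difference is roughly:

e_ij = LeakyReLU(a^T [W h_i || W h_j])                          (original GAT)
e_ij = LeakyReLU(a_src^T tanh(W h_i) + a_dst^T tanh(W h_j))     (with the tanh above)

so the projected features are bounded to [-1, 1] before the attention scores are formed.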
