Hyperparameters of MLP part should be changed, if it refers to the paper #22

jhpjhp1118 · 2023-12-07T00:20:25Z

I appreciate your code :)

I would like to raise an issue about the hyperparameters.
According to the paper, I think the hyperparameters of the MLP part should be changed.

According to Appendix B of the paper, "mlp_hidden_mults" is multiplied by "input_size",
and "l" is the shared embedding dimension.

The code should be changed as shown below (class TabTransformer, __init__ method).

[Original code]

input_size = (dim * self.num_categories) + num_continuous
l = input_size // 8

hidden_dimensions = list(map(lambda t: l * t, mlp_hidden_mults))

[Modified code]

input_size = (dim * self.num_categories) + num_continuous
l = dim // 8 # to be used for the shared embedding

hidden_dimensions = list(map(lambda t: input_size * t, mlp_hidden_mults))

I think this is easy to confuse because the authors of the paper use "l" for two different things: the size of the MLP input and the dimension of the shared embedding.
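To make the difference concrete, here is a minimal standalone sketch that compares the two ways of computing the MLP hidden sizes. The function names and the example values (dim=32, 5 categorical columns, 10 continuous columns, multipliers (4, 2)) are assumptions chosen for illustration only; they are not taken from the repository or the paper.

```python
def hidden_dims_original(dim, num_categories, num_continuous, mlp_hidden_mults):
    """Hidden sizes as in the [Original code]: multipliers scale input_size // 8."""
    input_size = (dim * num_categories) + num_continuous
    l = input_size // 8
    return [l * t for t in mlp_hidden_mults]


def hidden_dims_proposed(dim, num_categories, num_continuous, mlp_hidden_mults):
    """Hidden sizes as in the [Modified code]: multipliers scale input_size directly."""
    input_size = (dim * num_categories) + num_continuous
    return [input_size * t for t in mlp_hidden_mults]


# Hypothetical example values, for illustration only.
dim, num_categories, num_continuous = 32, 5, 10
mults = (4, 2)

original = hidden_dims_original(dim, num_categories, num_continuous, mults)
proposed = hidden_dims_proposed(dim, num_categories, num_continuous, mults)
shared_embed_dim = dim // 8  # "l" in the shared-embedding sense, per the proposal

print(original)          # [84, 42]
print(proposed)          # [680, 340]
print(shared_embed_dim)  # 4
```

With these values input_size is 170, so the original code yields hidden layers roughly 8x narrower than the proposed version.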

Someone else has already opened an issue about the shared embedding (#12), so the code should be modified with that issue in mind as well.

Please check whether my opinion is correct or not.

Thank you.

lucidrains · 2023-12-07T01:14:46Z

hey Joon-hyuk! thanks for raising this issue

do you want to see if 43ecec9 takes care of your issue, as well as the lack of a shared embedding feature?

jhpjhp1118 · 2023-12-07T02:23:49Z

Dear lucidrains,

I checked.
Your commit is better than my requested commit for this whole project 👍

Thank you for addressing this issue. 😄

jhpjhp1118 mentioned this issue Dec 7, 2023

Update hyperparameters of MLP & shared embedding #23

Closed

lucidrains added a commit that referenced this issue Dec 7, 2023

address #22, and also take care of shared embedding

43ecec9

jhpjhp1118 closed this as completed Dec 7, 2023
