Replace `--eval-mode = ['deterministic', 'stochastic']` with temperature. #74

eihli · 2024-01-21T23:03:04Z

A note from the meeting with Daniel today.

For an example of temperature calculation, see: https://github.com/kzl/decision-transformer/blob/e2d82e68f330c00f763507b3b01d774740bee53f/atari/mingpt/utils.py#L47

Right now we're either entirely deterministic (temperature = 0) or sampling stochastically from the multinomial over the probabilities (temperature = 1). See:

NEKO/gato/policy/gato_policy.py

Line 442 in 119296c

token = torch.argmax(logits, dim=-1)

The change to temperature looks like it will be simple and it would give a lot more flexibility in output.
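A rough sketch of what a single temperature-based selection step might look like, covering both existing modes. The function name `sample_token` and its signature are hypothetical, not something that exists in `gato_policy.py`, and treating `temperature == 0` as plain argmax is an assumption about how the deterministic case would be kept.

```python
import torch


def sample_token(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Pick one token id per row of `logits` (shape [..., vocab_size]).

    Hypothetical helper for illustration; not part of the existing policy code.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Greedy choice, matching the current 'deterministic' mode.
        return torch.argmax(logits, dim=-1)
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    probs = torch.softmax(logits / temperature, dim=-1)
    # torch.multinomial expects a 1-D or 2-D input, so flatten leading dims.
    flat = probs.reshape(-1, probs.shape[-1])
    token = torch.multinomial(flat, num_samples=1).squeeze(-1)
    return token.reshape(logits.shape[:-1])
```

With something like this, `temperature=0` would reproduce the `torch.argmax` line above and `temperature=1` would reproduce the current stochastic sampling, so a single float could stand in for both `--eval-mode` values.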

eihli · 2024-01-30T15:02:34Z

I was just playing around to see what this looked like.

torch.multinomial(torch.nn.functional.softmax(logits / 0.01), num_samples=10)
tensor([837, 262,   0,   1,   6,   7,   5,   4,   2,   3])
torch.multinomial(torch.nn.functional.softmax(logits / 0.01), num_samples=10)
tensor([837, 262,   0,   1,   6,   7,   5,   4,   2,   3])
torch.multinomial(torch.nn.functional.softmax(logits / 0.01), num_samples=10)
tensor([837, 262,   0,   1,   6,   7,   5,   4,   2,   3])

torch.multinomial(torch.nn.functional.softmax(logits / 0.05), num_samples=10)
tensor([  837,   262,   764,  2488,   286, 41917,  6940, 50103,   311, 27596])
torch.multinomial(torch.nn.functional.softmax(logits / 0.05), num_samples=10)
tensor([  837,   262,   764,  2488,   286, 41917, 50103, 27596,   311,  6940])
torch.multinomial(torch.nn.functional.softmax(logits / 0.05), num_samples=10)
tensor([  837,   262,   764,  2488,   286, 27596, 50103,   311, 41917, 14146])

torch.multinomial(torch.nn.functional.softmax(logits / 0.2), num_samples=10)
tensor([  837,   262,   764, 19423,   286, 35619, 12605,  1054,  2488,  8727])
torch.multinomial(torch.nn.functional.softmax(logits / 0.2), num_samples=10)
tensor([  837,   262, 47533,   764,  5907, 17533,   286,  2488,  4858, 44000])
torch.multinomial(torch.nn.functional.softmax(logits / 0.2), num_samples=10)
tensor([  837,   262,   764, 31617,  3327, 46020,   311,  2488, 44000, 35027])

torch.multinomial(torch.nn.functional.softmax(logits / 0.5), num_samples=10)
tensor([44077, 24514, 36254, 46326, 17648, 43767, 22619, 37505, 27588, 30627])
torch.multinomial(torch.nn.functional.softmax(logits / 0.5), num_samples=10)
tensor([40022, 25254, 30907, 15573, 48496, 47162,  8696, 48436, 38703, 16449])
torch.multinomial(torch.nn.functional.softmax(logits / 0.5), num_samples=10)
tensor([25441, 43506, 43061,  9675, 30613, 20674,  8506, 20386, 37007, 24969])
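
The pattern in these draws (the same leading tokens at 0.01, scattered ones at 0.5) is what you'd expect from dividing the logits by a small temperature before the softmax: the gap between the top logit and the rest gets stretched, so the top token soaks up nearly all the probability. A toy sketch with made-up logits (not the model's real ones) that prints how the top-token probability moves with temperature:

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed stably in plain Python."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for illustration only; not taken from the model.
toy_logits = [2.0, 1.5, 0.5, 0.0]
for t in (0.01, 0.05, 0.2, 0.5, 1.0):
    top_prob = max(softmax_with_temperature(toy_logits, t))
    print(f"temperature={t}: top-token probability={top_prob:.3f}")
```

The exact numbers depend entirely on the logits, so this only shows the trend, not what a good default temperature for the policy would be.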
