Question about num_timestep #12

carolineys · 2024-06-04T12:22:11Z

In the code, the constructor of the ReplayMemoryDataset class takes a "num_timesteps" argument. Does "num_timesteps" correspond to the "window" concept in the paper? As I understand it, the Q-function proposed in the paper takes only the states within the window (s_{t-w} to s_t) and feeds them into the transformer, rather than using all previous states. Is this interpretation correct? I am confused because the default "num_timesteps" in the code is 1, which, as far as I can tell, would not encode any sequential information.

lucidrains · 2024-06-11T13:49:28Z

@YifeiChen777 hi Caroline, yes you need to set that to greater than 1 for the autoregressive Q-learning
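
For readers following along, here is a minimal sketch of the "window" idea from the question, in plain Python. It is not the repository's ReplayMemoryDataset; `sliding_windows` and the toy episode are hypothetical names used only for illustration. It shows that a window of 1 yields isolated states, while a larger window yields runs of consecutive states.

```python
def sliding_windows(states, window):
    """Return every run of `window` consecutive states from one episode."""
    if window < 1:
        raise ValueError("window must be >= 1")
    return [states[i:i + window] for i in range(len(states) - window + 1)]

# toy episode; the state names are placeholders
episode = ["s0", "s1", "s2", "s3", "s4"]

single = sliding_windows(episode, 1)   # each sample is one isolated state
triple = sliding_windows(episode, 3)   # each sample spans three consecutive states

print(single)
print(triple)
```

With a window of 1, no sample contains more than one state, so nothing sequential can be learned from a single sample.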

yuriy10139 · 2024-07-31T06:30:48Z

In the autoregressive_q_learn() method of QLearner, the time-step dimension is folded into the batch dimension. E.g., if we have a batch of 16 samples with 3 timesteps each, it is converted into a batch of 48.

    # fold time steps into batch
    states, time_ps = pack_one(states, "* c f h w")
    actions, _ = pack_one(actions, "* n")

It looks like everything after this point ignores history and treats each element of the batch independently, because the attention mechanism does not span the batch dimension. If every item in the batch attends only to itself (and cross-attends to encoded_state, but still within a single batch item), the model never sees inter-timestep dependencies and cannot learn them. Please correct me if I'm missing something.
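
A small NumPy sketch of the concern above. It assumes a plain softmax self-attention as a stand-in (`self_attention` is a hypothetical helper, not the model's attention code), and the shapes are illustrative. After folding time into batch, changing one timestep leaves the outputs of the other timesteps untouched.

```python
import numpy as np

def self_attention(x):
    """Plain softmax self-attention over the token axis, separately per batch item."""
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
batch, timesteps, tokens, dim = 16, 3, 4, 8   # illustrative sizes
x = rng.normal(size=(batch, timesteps, tokens, dim))

# fold time steps into batch: (16, 3, 4, 8) -> (48, 4, 8)
folded = x.reshape(batch * timesteps, tokens, dim)
out = self_attention(folded).reshape(batch, timesteps, tokens, dim)

# perturb only timestep 0 of every sample, then run the same computation
x_perturbed = x.copy()
x_perturbed[:, 0] += 1.0
out_perturbed = self_attention(
    x_perturbed.reshape(batch * timesteps, tokens, dim)
).reshape(batch, timesteps, tokens, dim)

# timesteps 1 and 2 never saw the change made to timestep 0
later_steps_unchanged = np.allclose(out[:, 1:], out_perturbed[:, 1:])
first_step_changed = not np.allclose(out[:, 0], out_perturbed[:, 0])
print(folded.shape, later_steps_unchanged, first_step_changed)
```

Any operation that mixes information only within a batch item behaves this way, so the point does not depend on the exact attention variant.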

Johnly1986 · 2024-08-15T03:07:02Z

When I set num_timestep = 50, a tensor size mismatch occurs:

    RuntimeError: The size of tensor a (200) must match the size of tensor b (10000) at non-singleton dimension 0

lucidrains · 2024-08-15T04:00:26Z

@Johnly1986 could you try again on the latest version?

Johnly1986 · 2024-08-15T06:59:52Z

I used the latest version, and the tensor size mismatch caused by num_timestep no longer occurs, but the program exits on its own after running for a while.

I found that the sudden exit was caused by memory exhaustion: it used 30GB of memory.
Now I want to know where the memory is being consumed, and I hope it can run on the GPU.
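
One place worth checking, as a rough sketch only: the earlier error reported sizes 200 and 10000, which is consistent with 200 samples times 50 timesteps being folded into one batch. Assuming that reading is right, the helper below estimates the size of the input tensor alone. `folded_batch_bytes` and the frame shape are hypothetical; the thread does not state the actual shapes or confirm that this is where the 30GB goes.

```python
def folded_batch_bytes(batch_size, num_timesteps, frame_shape, bytes_per_element=4):
    """Return (effective batch, bytes) for one input batch after folding time into batch."""
    elements_per_frame = 1
    for dim in frame_shape:
        elements_per_frame *= dim
    effective_batch = batch_size * num_timesteps
    return effective_batch, effective_batch * elements_per_frame * bytes_per_element

# hypothetical (channels, frames, height, width); not taken from the thread
frame_shape = (3, 1, 224, 224)

effective_batch, nbytes = folded_batch_bytes(200, 50, frame_shape)
print(effective_batch)              # 200 * 50 samples reach the model at once
print(f"{nbytes / 1024**3:.2f} GiB for the float32 input tensor alone")
```

Activations and gradients scale with the same effective batch, so lowering the batch size or num_timestep is a cheap first experiment.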

lucidrains · 2024-08-16T17:05:55Z

> In the autoregressive_q_learn() method of QLearner, the time-step dimension is folded into the batch dimension. E.g., if we have a batch of 16 samples with 3 timesteps each, it is converted into a batch of 48.
>
>     # fold time steps into batch
>     states, time_ps = pack_one(states, "* c f h w")
>     actions, _ = pack_one(actions, "* n")
>
> It looks like everything after this point ignores history and treats each element of the batch independently, because the attention mechanism does not span the batch dimension. If every item in the batch attends only to itself (and cross-attends to encoded_state, but still within a single batch item), the model never sees inter-timestep dependencies and cannot learn them. Please correct me if I'm missing something.

hi, yes this is correct afaict.

this is why the todo section in the readme has the item "improvise cross attention to past actions and states of timestep, transformer-xl fashion (w/ structured memory dropout)"
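
For context, a minimal NumPy sketch of what attending to past timesteps in a transformer-xl-like fashion could look like. This is not the planned implementation. `attend_with_memory` is a hypothetical helper, and reading "structured memory dropout" as dropping the whole memory for a sample is an assumption made here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_with_memory(current, memory, memory_dropout=0.0, rng=None):
    """Tokens of the current timestep attend over [memory ; current].

    current: (batch, tokens, dim) tokens of the present timestep
    memory:  (batch, mem_tokens, dim) cached tokens from past timesteps
    memory_dropout: probability of hiding the whole memory for a sample (assumed meaning)
    """
    keys = np.concatenate([memory, current], axis=1)
    scores = current @ np.swapaxes(keys, -1, -2) / np.sqrt(current.shape[-1])

    if memory_dropout > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        drop = rng.random(current.shape[0]) < memory_dropout   # one decision per sample
        mask = np.zeros(scores.shape, dtype=bool)
        mask[drop, :, :memory.shape[1]] = True                 # hide all memory columns
        scores = np.where(mask, -np.inf, scores)

    return softmax(scores) @ keys

rng = np.random.default_rng(0)
current = rng.normal(size=(2, 4, 8))
memory_a = rng.normal(size=(2, 6, 8))
memory_b = memory_a + 1.0            # a different history for the same current step

# with memory visible, the output depends on the history
sees_history = not np.allclose(
    attend_with_memory(current, memory_a), attend_with_memory(current, memory_b)
)
# with memory always dropped, the history has no effect
ignores_history_when_dropped = np.allclose(
    attend_with_memory(current, memory_a, memory_dropout=1.0),
    attend_with_memory(current, memory_b, memory_dropout=1.0),
)
print(sees_history, ignores_history_when_dropped)
```

Unlike folding time into batch, the memory here stays attached to its own sample, so each timestep can read from its past.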

are you able to get things working with just single frames? i'm happy to invest some time building out transformer-xl component for you if you have everything setup, and willing to share your experimental results

lucidrains · 2024-08-16T17:07:21Z

@yuriy10139 also, you should reach out to @2M-kotb, as i think he was playing around with the repo for his research some time back

yuriy10139 · 2024-08-19T12:03:35Z

> > In the autoregressive_q_learn() method of QLearner, the time-step dimension is folded into the batch dimension. E.g., if we have a batch of 16 samples with 3 timesteps each, it is converted into a batch of 48.
> >
> >     # fold time steps into batch
> >     states, time_ps = pack_one(states, "* c f h w")
> >     actions, _ = pack_one(actions, "* n")
> >
> > It looks like everything after this point ignores history and treats each element of the batch independently, because the attention mechanism does not span the batch dimension. If every item in the batch attends only to itself (and cross-attends to encoded_state, but still within a single batch item), the model never sees inter-timestep dependencies and cannot learn them. Please correct me if I'm missing something.
>
> hi, yes this is correct afaict.
>
> this is why the todo section in the readme has the item "improvise cross attention to past actions and states of timestep, transformer-xl fashion (w/ structured memory dropout)"
>
> are you able to get things working with just single frames? i'm happy to invest some time building out transformer-xl component for you if you have everything setup, and willing to share your experimental results

I've prepared a small demo in a separate repository: https://github.com/yuriy10139/q-transfromer-maniskill-demo
I'm currently a bit short of compute, so I have tried it just once on my laptop, without any hyperparameter search, on 112x112 images from a single camera (the Maniskill env default is 128x128, so I guess that is not too bad).

Over 50 eval runs, the model reaches an average reward of 4.054, 4.647, and 4.568 for the 4000-step, 7000-step, and 10000-step checkpoints respectively, so it is probably learning something, but it is still quite far from well trained.

If you manage to add history-based learning, I hope to find more GPU time to test it at a higher resolution and for more timesteps.

lucidrains · 2024-08-19T13:23:37Z

@yuriy10139 thank you! i'll add a few things soon

yuriy10139 · 2024-09-12T09:09:41Z

Hi @lucidrains, was the demo setup code of any help?
