Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about num_timestep #12

Open
carolineys opened this issue Jun 4, 2024 · 10 comments
Open

Question about num_timestep #12

carolineys opened this issue Jun 4, 2024 · 10 comments

Comments

@carolineys
Copy link

carolineys commented Jun 4, 2024

In the code, there is this "num_timesteps" in the constructor of the ReplayMemoryDataset class. Does this "num_timesteps" correspond to the concept of "window" in the paper? In my understanding, the q function proposed in the paper only takes states within the window range (s_{t-w} to s_t) and feed them into the transformer instead of using all of the previous states. Is this interpretation correct? I am confused because in the code, the default "num_timesteps" is 1, which, in my understanding won't encode any sequential info.

@lucidrains
Copy link
Owner

@YifeiChen777 hi Caroline, yes you need to set that to greater than 1 for the autoregressive Q-learning

@yuriy10139
Copy link

yuriy10139 commented Jul 31, 2024

in autoregressive_q_learn() method of QLearner the time steps dimension is moved into batch dimension. E.g. if we have a batch of 16 samples with 3 timesteps each - this will be converted into a batch of 48.

    # fold time steps into batch
    states, time_ps = pack_one(states, "* c f h w")
    actions, _ = pack_one(actions, "* n")

It looks like all the next learning does not take into account history and treats each element of a batch independently because attention mechanism seems to not span across the batch dimension. Thus, if every item in a batch attends only to itself (and cross-attends to encoded_state but still within a single item of a batch), the model does not see and will not recognize any inter-timestep dependencies. Please correct me if miss something?

@Johnly1986
Copy link

When I change num_timestep = 50, a problem of tensor size mismatch occurs
RuntimeError: The size of tensor a (200) must match the size of tensor b (10000) at non-singleton dimension 0

@lucidrains
Copy link
Owner

@Johnly1986 could you try again on the latest version?

@Johnly1986
Copy link

Johnly1986 commented Aug 15, 2024

image
I used the latest version, and the tensor size mismatch caused by num_timestep no longer exists, but it automatically exits after running for a while.

I found out that the sudden exit was because of memory explosion, it used 30GB of memory..
Now I want to know where the memory is being consumed and hope it can run on the GPU

@lucidrains
Copy link
Owner

lucidrains commented Aug 16, 2024

in autoregressive_q_learn() method of QLearner the time steps dimension is moved into batch dimension. E.g. if we have a batch of 16 samples with 3 timesteps each - this will be converted into a batch of 48.

    # fold time steps into batch
    states, time_ps = pack_one(states, "* c f h w")
    actions, _ = pack_one(actions, "* n")

It looks like all the next learning does not take into account history and treats each element of a batch independently because attention mechanism seems to not span across the batch dimension. Thus, if every item in a batch attends only to itself (and cross-attends to encoded_state but still within a single item of a batch), the model does not see and will not recognize any inter-timestep dependencies. Please correct me if miss something?

hi, yes this is correct afaict.

this is why in the todo section in the readme i have written improvise cross attention to past actions and states of timestep, transformer-xl fashion (w/ structured memory dropout)

are you able to get things working with just single frames? i'm happy to invest some time building out transformer-xl component for you if you have everything setup, and willing to share your experimental results

@lucidrains
Copy link
Owner

@yuriy10139 also, you should reach out to @2M-kotb, as i think he was playing around with the repo for his research some time back

@yuriy10139
Copy link

yuriy10139 commented Aug 19, 2024

in autoregressive_q_learn() method of QLearner the time steps dimension is moved into batch dimension. E.g. if we have a batch of 16 samples with 3 timesteps each - this will be converted into a batch of 48.

    # fold time steps into batch
    states, time_ps = pack_one(states, "* c f h w")
    actions, _ = pack_one(actions, "* n")

It looks like all the next learning does not take into account history and treats each element of a batch independently because attention mechanism seems to not span across the batch dimension. Thus, if every item in a batch attends only to itself (and cross-attends to encoded_state but still within a single item of a batch), the model does not see and will not recognize any inter-timestep dependencies. Please correct me if miss something?

hi, yes this is correct afaict.

this is why in the todo section in the readme i have written improvise cross attention to past actions and states of timestep, transformer-xl fashion (w/ structured memory dropout)

are you able to get things working with just single frames? i'm happy to invest some time building out transformer-xl component for you if you have everything setup, and willing to share your experimental results

I've prepared a small demo in a separate repository https://github.com/yuriy10139/q-transfromer-maniskill-demo
Currently a bit short of compute, so I have tried it just once on my laptop without any hyperparameter search and on 112x112 image from a single camera (Maniskill env default is 128x128, so I guess that is not that bad).

On 50 eval runs the model shows 4.054, 4.647, 4.568 of average reward for the 4000-step, 7000-step and 10000-step checkpoints respectively, so probably it learns something, but still quite far from well-trained.

If you'll manage to add history-based learning, I hope to find more GPU time to test it with bigger resolution and for more timesteps.

@lucidrains
Copy link
Owner

@yuriy10139 thank you! i'll add a few things soon

@yuriy10139
Copy link

Hi @lucidrains, was the code of demo setup any helpful?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants