Question about the input shape of 'CuboidTransformerEncoder' in cuboid_transformer.py #61

Open
paradigm21c opened this issue Nov 24, 2023 · 1 comment

paradigm21c commented Nov 24, 2023

I am currently working with multiband (4-band) satellite images of shape (B, T, H, W, C=4), and I have a question about the code in line 2930:
self.encoder = CuboidTransformerEncoder(
input_shape=(T_in, H_in, W_in, base_units), ...

I was under the impression that the input shape for 'CuboidTransformerEncoder' is (T, H, W, C), as mentioned in line 1696. Could you clarify why the shape parameters differ between these two lines?

This difference caused an error in line 1898:
assert (T, H, W, C_in) == self.input_shape

Should I follow cuboid_transformer_unet_dec.py?

@gaozhihan
Contributor

Thank you for your question. The input shape asserted in CuboidTransformerEncoder in line 1898
assert (T, H, W, C_in) == self.input_shape

should match the shape of x in line 3184
mem_l, mem_global_vector_l = self.encoder(x, init_global_vectors)

At that point x has already been downsampled by the initial downsampler, so it is no longer the raw (T, H, W, C) input.
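
To make the shape relationship concrete, here is a minimal sketch of how the encoder's expected input shape could be derived from the raw input shape. The helper `encoder_input_shape`, the per-stage `(t, h, w)` scale factors, and the ceiling-division rounding are all assumptions made for illustration; they are not the repository's API, and the actual downsampler may pad or round differently.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, int, int, int]


def encoder_input_shape(
    raw_shape: Shape,
    downsample_scales: Sequence[Tuple[int, int, int]],
    base_units: int,
) -> Shape:
    """Hypothetical helper: shape of x after an initial downsampler.

    raw_shape is (T, H, W, C) without the batch dimension.
    Each entry of downsample_scales is an assumed (t, h, w) factor.
    The channel dimension is replaced by base_units, as in
    input_shape=(T_in, H_in, W_in, base_units).
    """
    t, h, w, _c = raw_shape
    for st, sh, sw in downsample_scales:
        # Ceiling division; assumes sizes are padded up to a multiple.
        t = -(-t // st)
        h = -(-h // sh)
        w = -(-w // sw)
    return (t, h, w, base_units)


if __name__ == "__main__":
    raw = (10, 64, 64, 4)  # illustrative 4-band input, values made up
    expected = encoder_input_shape(raw, [(1, 2, 2)], base_units=64)
    # The raw shape is not what the encoder's assertion checks against.
    print("raw shape:     ", raw)
    print("encoder expects:", expected)
```

Under these assumptions, passing the raw (T, H, W, 4) shape as the encoder's input_shape would trip the assertion, because the tensor that reaches the encoder has base_units channels and a reduced spatial size.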

You may want to refer to the simplest test case to verify that the shapes are aligned correctly. According to my testing, it should work for multi-channel inputs. You can modify input_shape and target_shape in lines 24 and 25 for your own use. Please note that this test script is from my fork, which has not been merged into this repo.
