You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Both use the sevir dataset to predict for two hours
Using the adamw optimizer with a learning rate of 1e-4, none of them used the task learning rate optimization strategy
Here are the earthformer model parameter settings:
base_units=128,
block_units=None,
scale_alpha=1.0,
num_heads=4,
attn_drop=0.0,
proj_drop=0.0,
ffn_drop=0.0,
Both use the sevir dataset to predict for two hours
Using the adamw optimizer with a learning rate of 1e-4, none of them used the task learning rate optimization strategy
Here are the earthformer model parameter settings:
base_units=128,
block_units=None,
scale_alpha=1.0,
num_heads=4,
attn_drop=0.0,
proj_drop=0.0,
ffn_drop=0.0,
inter-attn downsample/upsample
downsample=2,
downsample_type='patch_merge',
upsample_type="upsample",
upsample_kernel_size=3,
encoder
enc_depth=[2, 2],
enc_attn_patterns=None,
enc_cuboid_size=[(4, 4, 4), (4, 4, 4)],
enc_cuboid_strategy=[('l', 'l', 'l'), ('d', 'd', 'd')],
enc_shift_size=[(0, 0, 0), (0, 0, 0)],
enc_use_inter_ffn=True,
decoder
dec_depth=[2, 2],
dec_cross_start=0,
dec_self_attn_patterns=None,
dec_self_cuboid_size=[(4, 4, 4), (4, 4, 4)],
dec_self_cuboid_strategy=[('l', 'l', 'l'), ('d', 'd', 'd')],
dec_self_shift_size=[(1, 1, 1), (0, 0, 0)],
dec_cross_attn_patterns=None,
dec_cross_cuboid_hw=[(4, 4), (4, 4)],
dec_cross_cuboid_strategy=[('l', 'l', 'l'), ('d', 'l', 'l')],
dec_cross_shift_hw=[(0, 0), (0, 0)],
dec_cross_n_temporal=[1, 2],
dec_cross_last_n_frames=None,
dec_use_inter_ffn=True,
dec_hierarchical_pos_embed=False,
global vectors
num_global_vectors=8,
use_dec_self_global=False,
dec_self_update_global=True,
use_dec_cross_global=False,
use_global_vector_ffn=False,
use_global_self_attn=True,
separate_global_qkv=True,
global_dim_ratio=1,
z_init_method='zeros',
# initial downsample and final upsample
initial_downsample_type="stack_conv",
initial_downsample_activation="leaky",
initial_downsample_type=="conv"
initial_downsample_scale=1,
initial_downsample_conv_layers=2,
final_upsample_conv_layers=2,
initial_downsample_type == "stack_conv"
initial_downsample_stack_conv_num_layers=3,
initial_downsample_stack_conv_dim_list=[16, 64, 128], # [96, 384, 768]
initial_downsample_stack_conv_downscale_list=[3, 2, 2],
initial_downsample_stack_conv_num_conv_list=[2, 2, 2],
# end of initial downsample and final upsample
ffn_activation='gelu',
gated_ffn=False,
norm_layer='layer_norm',
padding_type='ignore',
pos_embed_type='t+hw',
checkpoint_level=0,
use_relative_pos=True,
self_attn_use_final_proj=True,
dec_use_first_self_attn=False,
initialization
attn_linear_init_mode="0",
ffn_linear_init_mode="0",
conv_init_mode="0",
down_up_linear_init_mode="0",
norm_init_mode="0",
The text was updated successfully, but these errors were encountered: