About the training results #14

Open
xuxiaoxxxx opened this issue May 8, 2023 · 0 comments

Thanks for your great work!
I trained the model on a single NVIDIA V100, ran the evaluation on it, and got the following output:

stats:
unique | not_in_others: 1793
unique | in_others: 52
unique | overall: 1845
multiple | not_in_others: 4382
multiple | in_others: 3281
multiple | overall: 7663
overall | not_in_others: 6175
overall | in_others: 3333
overall | overall: 9508

unique:
unique | not_in_others | ref_acc: 0.12325711098717233
unique | not_in_others | acc@0.25iou: 0.7702175125488009
unique | not_in_others | acc@0.5iou: 0.6068042387060792
unique | in_others | ref_acc: 0.23076923076923078
unique | in_others | acc@0.25iou: 0.75
unique | in_others | acc@0.5iou: 0.5769230769230769
unique | overall | ref_acc: 0.12628726287262873
unique | overall | acc@0.25iou: 0.7696476964769647
unique | overall | acc@0.5iou: 0.6059620596205962

multiple:
multiple | not_in_others | ref_acc: 0.07644910999543587
multiple | not_in_others | acc@0.25iou: 0.3039707895937928
multiple | not_in_others | acc@0.5iou: 0.2405294386125057
multiple | in_others | ref_acc: 0.21578786955196586
multiple | in_others | acc@0.25iou: 0.42487046632124353
multiple | in_others | acc@0.5iou: 0.2883267296555928
multiple | overall | ref_acc: 0.1361085736656662
multiple | overall | acc@0.25iou: 0.35573535168993864
multiple | overall | acc@0.5iou: 0.26099438862064467

overall:
overall | not_in_others | ref_acc: 0.09004048582995951
overall | not_in_others | acc@0.25iou: 0.4393522267206478
overall | not_in_others | acc@0.5iou: 0.3468825910931174
overall | in_others | ref_acc: 0.21602160216021601
overall | in_others | acc@0.25iou: 0.42994299429942995
overall | in_others | acc@0.5iou: 0.29282928292829286
overall | overall | ref_acc: 0.13420277660917124
overall | overall | acc@0.25iou: 0.43605384938998737
overall | overall | acc@0.5iou: 0.32793437105595286

language classification accuracy: 0.8964105985627067
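
As a quick sanity check on the numbers above, the "overall | overall" figures should equal the sample-weighted mean of the "unique | overall" and "multiple | overall" figures, using the counts from the stats block. The sketch below checks this for ref_acc; the helper name `weighted_mean` is only illustrative and is not part of the repository. The same check can be applied to the two metrics printed after ref_acc in each group.

```python
# Counts and ref_acc values copied from the evaluation output above.
counts = {"unique": 1845, "multiple": 7663}
ref_acc = {"unique": 0.12628726287262873, "multiple": 0.1361085736656662}
reported_overall = 0.13420277660917124


def weighted_mean(values, weights):
    """Sample-weighted mean of per-subset accuracies (illustrative helper)."""
    total = sum(weights.values())
    return sum(values[name] * weights[name] for name in weights) / total


recomputed_overall = weighted_mean(ref_acc, counts)

# The two numbers agree, so the log is internally consistent and the low
# scores are not an aggregation artifact.
print(f"recomputed: {recomputed_overall:.6f}")
print(f"reported:   {reported_overall:.6f}")
```

If the recomputed and reported values match, the evaluation script is aggregating correctly and the gap to the paper has to come from training or data rather than from how the subsets are combined.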

These results are much lower than those reported in the paper.
My config file is unchanged from default.yaml and is as follows:

GENERAL:
  manual_seed: 3407
  tag: default
  gpu: '1'
  debug: False
  distribute: False

PATH:
  root_path: '/data/code/3dvg/3D-SPS'
  scannet_data_folder: '###/scannet_data'
  scanref_data_root: '/data/dataset/scanrefer'

DATA:
  dataset: ScanRefer
  num_points: 40000
  num_scenes: -1
  num_classes: 20
  use_augment: False
  max_num_obj: 128

  # input
  use_height: False
  use_color: True
  use_normal: True
  use_multiview: True
  fuse_multi_mode: late   # early or late


  # label
  det_class_label: main_with_others    # all, main_with_others, main

MODEL:
  # general
  dropout: 0.1
  use_checkpoint: False

  # point backbone
  point_feat_dim: 288

  # visual feature
  vis_feat_dim: 128
  
  # sampling
  sampling: kpsa-lang-filter
  num_proposal: 512
  kps_fusion_dim: 256
  use_ref_score_loss: True
  use_context_label: False
  ref_use_obj_mask: True
  
  # Head
  size_cls_agnostic: False
  use_objectness: True

  # Language Module
  lang_emb_type: clip
  max_des_len: 77
  word_erase: 0.1
  #embedding_size: 300
  #gru_hidden_size: 256
  #gru_num_layer: 1
  #use_bidir: False # bi-directional GRU

  # Transformer
  model: 'TransformerFilter'
  num_decoder_layers: 5
  object_position_embedding: loc_learned
  point_position_embedding: xyz_learned
  lang_position_embedding: none
  transformer_feat_dim: 384
  ffn_dim: 2048
  n_head: 4
  transformer_dropout: 0.05
  use_ref_mask: False
  use_att_score: True
  ref_filter_steps: [1,2,3,4]
  ref_mask_scale: 0.5
  transformer_mode: serial

  # pretrain
  use_pretrained: True
  pretrain_path: '/data/code/3dvg/3D-SPS/data/xyz_rgb_norm_backbone.pth'
  trans_pre_model: False

LOSS:
  # ----- Refer -----
  no_detection: False
  no_reference: False
  no_lang_cls: False
  ref_each_stage: True
  cls_each_stage: True
  ref_criterion: rank
  # ----- Detection -----
  kps_topk: 5
  kps_loss_weight: 0.8
  det_loss_weight: 5
  center_delta: 0.04
  size_delta: 0.111111111111
  heading_delta: 1

TRAIN:
  batch_size: 24
  num_workers: 0
  epoch: 32

  lr: 0.001
  decoder_lr: 0.0001
  det_decoder_lr: False
  lr_decay_step: [16, 24, 28]
  lr_decay_rate: 0.1
  bn_decay_step: 10
  bn_decay_rate: 0.1
  bn_momentum_init: 0.2
  bn_momentum_min: 0.001
  wd: 0.0005

  verbose: 20       # iter num to output log in shell
  val_freq: 2       # epoch num to val
  eval_det: True
  eval_ref: True
  iou_ref_th: 0.25
  iou_ref_topk : 4
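
To confirm that a config like the one above really matches default.yaml, one option is to load both files into nested dictionaries and compare them key by key. The following is a minimal sketch, not code from the repository: `flatten` and `diff_configs` are hypothetical helper names, and the `reference` dictionary holds made-up values purely for illustration (they are not the repository's actual defaults). In practice both dictionaries could come from PyYAML's `yaml.safe_load`.

```python
MISSING = "<missing>"  # placeholder shown when a key exists on one side only


def flatten(cfg, prefix=""):
    """Flatten a nested dict into {'SECTION.key': value} pairs."""
    flat = {}
    for key, value in cfg.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every key that differs."""
    a, b = flatten(mine), flatten(reference)
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in sorted(set(a) | set(b))
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


# A fragment of the TRAIN section posted above.
mine = {"TRAIN": {"batch_size": 24, "lr": 0.001, "epoch": 32}}

# Hypothetical reference values, invented for this example only.
reference = {
    "TRAIN": {"batch_size": 24, "lr": 0.002, "epoch": 32, "hypothetical_key": 1}
}

differences = diff_configs(mine, reference)
for key, (my_value, ref_value) in differences.items():
    print(f"{key}: mine={my_value!r} reference={ref_value!r}")
```

An empty result means the two configs are identical, which rules out an accidental config change as the cause of the gap.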

Could you please help me with this problem, or share your training log? Thanks for sharing your work and for your reply!
