Dataset | architecture | depth | init | clips x crops | #frames x sampling rate | acc@1 | acc@5 | checkpoint | config |
---|---|---|---|---|---|---|---|---|---|
K400 | TAda2D | R50 | IN-1K | 10 x 3 | 8 x 8 | 76.7 | 92.6 | [google drive][baidu(code:p06d)] | tada2d_8x8.yaml |
K400 | TAda2D | R50 | IN-1K | 10 x 3 | 16 x 5 | 77.4 | 93.1 | [google drive][baidu(code:6k8h)] | tada2d_16x5.yaml |
K400 | ViViT Fact. Enc. | B16x2 | IN-21K | 4 x 3 | 32 x 2 | 79.4 | 94.0 | [google drive][baidu(code:1t51)] | vivit_fac_enc_b16x2.yaml |
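
The `clips x crops` and `#frames x sampling rate` columns can be turned into per-video numbers with a little arithmetic. The sketch below assumes the usual reading of these columns: temporal clips times spatial crops gives the number of test views, and frames times sampling rate gives the approximate span of raw frames one clip covers. The helper names are hypothetical and not part of the codebase.

```python
from typing import Tuple


def parse_pair(cell: str) -> Tuple[int, int]:
    """Split a table cell such as '10 x 3' into two integers."""
    left, right = cell.split("x")
    return int(left), int(right)


def views_per_video(clips_x_crops: str) -> int:
    """Number of test views, assuming clips and crops are combined exhaustively."""
    clips, crops = parse_pair(clips_x_crops)
    return clips * crops


def approx_frame_span(frames_x_rate: str) -> int:
    """Approximate number of raw frames covered by one sampled clip."""
    frames, rate = parse_pair(frames_x_rate)
    return frames * rate


if __name__ == "__main__":
    # Values copied from the K400 rows above.
    for clips_crops, frames_rate in [("10 x 3", "8 x 8"), ("10 x 3", "16 x 5"), ("4 x 3", "32 x 2")]:
        print(clips_crops, "->", views_per_video(clips_crops), "views;",
              frames_rate, "->", approx_frame_span(frames_rate), "raw frames per clip")
```

Under this reading, the two TAda2D rows are evaluated on 30 views per video and the ViViT row on 12.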

Dataset | architecture | depth | init | clips x crops | #frames | acc@1 | acc@5 | checkpoint | config |
---|---|---|---|---|---|---|---|---|---|
SSV2 | TAda2D | R50 | IN-1K | 2 x 3 | 8 | 64.2 | 88.0 | [google drive][baidu(code:dlil)] | tada2d_8f.yaml |
SSV2 | TAda2D | R50 | IN-1K | 2 x 3 | 16 | 65.6 | 89.1 | [google drive][baidu(code:f857)] | tada2d_16f.yaml |

architecture | init | resolution | clips x crops | #frames x sampling rate | action acc@1 | verb acc@1 | noun acc@1 | checkpoint | config |
---|---|---|---|---|---|---|---|---|---|
ViViT Fact. Enc.-B16x2 | K700 | 320 | 4 x 3 | 32 x 2 | 46.3 | 67.4 | 58.9 | [google drive][baidu(code:rinh)] | vivit_fac_enc.yaml |
ir-CSN-R152 | K700 | 224 | 10 x 3 | 32 x 2 | 44.5 | 68.4 | 55.9 | [google drive][baidu(code:s0uj)] | csn.yaml |

feature | classification | type | mAP@0.1 | mAP@0.2 | mAP@0.3 | mAP@0.4 | mAP@0.5 | Avg | checkpoint | config |
---|---|---|---|---|---|---|---|---|---|---|
ViViT | ViViT | Verb | 22.90 | 21.93 | 20.74 | 19.08 | 16.00 | 20.13 | [google drive][baidu(code:3sud)] | vivit-os-local.yaml |
ViViT | ViViT | Noun | 28.95 | 27.38 | 25.52 | 22.67 | 18.95 | 24.69 | [google drive][baidu(code:3sud)] | vivit-os-local.yaml |
ViViT | ViViT | Action | 20.82 | 19.93 | 18.67 | 17.02 | 15.06 | 18.30 | [google drive][baidu(code:3sud)] | vivit-os-local.yaml |
TAda2D | TAda2D | Verb | 19.70 | 18.49 | 17.41 | 15.50 | 12.78 | 16.78 | [google drive][baidu(code:d01j)] | - |
TAda2D | TAda2D | Noun | 20.54 | 19.32 | 17.94 | 15.77 | 13.39 | 17.39 | [google drive][baidu(code:d01j)] | - |
TAda2D | TAda2D | Action | 15.15 | 14.32 | 13.59 | 12.18 | 10.65 | 13.18 | [google drive][baidu(code:d01j)] | - |
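
The `Avg` column in the table above appears to be the arithmetic mean of the five threshold columns, rounded to two decimals. The following sketch checks that assumption against the rows as printed; the dictionary layout and function name are illustrative only.

```python
# Rows copied from the table above: five per-threshold scores, then the reported Avg.
ROWS = {
    ("ViViT", "Verb"): ([22.90, 21.93, 20.74, 19.08, 16.00], 20.13),
    ("ViViT", "Noun"): ([28.95, 27.38, 25.52, 22.67, 18.95], 24.69),
    ("ViViT", "Action"): ([20.82, 19.93, 18.67, 17.02, 15.06], 18.30),
    ("TAda2D", "Verb"): ([19.70, 18.49, 17.41, 15.50, 12.78], 16.78),
    ("TAda2D", "Noun"): ([20.54, 19.32, 17.94, 15.77, 13.39], 17.39),
    ("TAda2D", "Action"): ([15.15, 14.32, 13.59, 12.18, 10.65], 13.18),
}


def mean_score(scores):
    """Plain arithmetic mean of the per-threshold scores."""
    return sum(scores) / len(scores)


def avg_matches(scores, reported, tol=0.006):
    """True if the reported Avg agrees with the mean up to rounding."""
    return abs(mean_score(scores) - reported) < tol


if __name__ == "__main__":
    for (feature, kind), (scores, reported) in ROWS.items():
        status = "ok" if avg_matches(scores, reported) else "mismatch"
        print(f"{feature:6s} {kind:6s} mean={mean_score(scores):.3f} reported={reported:.2f} {status}")
```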

Note: the following models use decord 0.4.1 rather than 0.6.0, the codebase default.

dataset | backbone | checkpoint | config |
---|---|---|---|
HMDB51 | R-2D3D-18 | [google drive][baidu(code:ahqg)] | papers/CVPR2021-MOSI/config/MoSI_r2d3d_hmdb.py |
HMDB51 | R(2+1)D-10 | [google drive][baidu(code:1ktb)] | papers/CVPR2021-MOSI/config/MoSI_r2p1d_hmdb.py |

dataset | backbone | acc@1 | acc@5 | checkpoint | config |
---|---|---|---|---|---|
HMDB51 | R-2D3D-18 | 46.93 | 74.71 | [google drive][baidu(code:2puu)] | papers/CVPR2021-MOSI/config/Finetune_r2d3d_hmdb.py |
HMDB51 | R(2+1)D-10 | 51.83 | 78.63 | [google drive][baidu(code:hgnc)] | papers/CVPR2021-MOSI/config/Finetune_r2p1d_hmdb.py |
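
Because the note above the HMDB51 tables ties those checkpoints to decord 0.4.1, it may be worth confirming the installed version before trying to reproduce the numbers. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the package is installed under the distribution name `decord`.

```python
from importlib import metadata

EXPECTED_DECORD = "0.4.1"  # version stated in the note for these models


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def decord_matches(found, expected=EXPECTED_DECORD):
    """True only when the installed version equals the expected one exactly."""
    return found == expected


if __name__ == "__main__":
    found = installed_version("decord")
    if decord_matches(found):
        print(f"decord {found} matches the version used for these models")
    else:
        print(f"decord {found!r} found; these models were produced with {EXPECTED_DECORD}")
```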