LTMU-H reproduction not performing as well as reported? #3

Open · vineetparikh opened this issue Apr 1, 2024 · 14 comments

@vineetparikh

Hi there, thanks so much for the great work and toolkit for future benchmarks!

I'm running the LTMU-H baseline on TREK-150 under the OPE protocol to get an initial sense of quantitative performance, and I'm finding that SS (Success Score), NPS (Normalized Precision Score), and GSR (Generalized Success Robustness) are significantly lower than the values reported in the paper. I've posted my numbers below.

I followed the setup guidelines, so my initial thought is that something differs between my setup and the one used for the paper's evaluation. Any idea what's going on?

[Attached: success, normalized precision, and generalized success robustness plots showing the reproduced scores]

@vineetparikh (Author)

For context, I'm using the same HOI and STARK checkpoints listed in the README, so I'm not sure whether any additional training is needed, or whether a different checkpoint produces the results in the paper.

@matteo-dunnhofer (Owner)

Hi @vineetparikh,

That's strange. I have tested the repo multiple times and always obtained the correct results. No additional training is needed, and no checkpoints other than those posted in the README are required. Maybe something is wrong with the frames or annotations? Did you try running the report method on the precomputed results we provide?
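
Something like the following should work, with result_dir pointed at the unpacked precomputed files (the exact directory layout and the 'LTMU-H' name string here are assumptions, so adjust them to match the downloaded files):

import sys
sys.path.append('./TREK-150-toolkit')

from toolkit.experiments import ExperimentTREK150

root_dir = './TREK-150-toolkit/TREK-150'

# result_dir should contain the unpacked precomputed results
# (assumed layout: <result_dir>/<tracker_name>/<per-sequence files>).
exp = ExperimentTREK150(root_dir,
                        result_dir='./precomputed-results',
                        report_dir='./precomputed-report')

# Generate the report without calling exp.run(), so the precomputed
# results are scored as-is. 'LTMU-H' is an assumed tracker name string.
exp.report(['LTMU-H'], protocol='ope')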

@vineetparikh (Author)

Yup, I pre-extracted the frames with the same ffmpeg version and visualized them to make sure the annotations looked right (I had actually opened another issue at matteo-dunnhofer/TREK-150-toolkit#5 before fixing that). Where can I find the precomputed results? I ran everything from scratch and got my results that way.
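
For reference, my sanity check was roughly the following, assuming the GOT-10k-style sequence layout (an img/ subfolder plus a groundtruth_rect.txt of comma-separated x,y,w,h rows; the folder name and frame filename pattern below are placeholders):

import cv2
import numpy as np

# Placeholder sequence folder; layout assumed GOT-10k style.
seq_dir = './TREK-150-toolkit/TREK-150/sequence-name'

# First ground-truth box: one comma-separated x,y,w,h row per frame.
gt = np.loadtxt(seq_dir + '/groundtruth_rect.txt', delimiter=',')
x, y, w, h = gt[0].astype(int)

# Draw the first ground-truth box on the first extracted frame.
frame = cv2.imread(seq_dir + '/img/00000001.jpg')  # assumed filename pattern
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('check_frame0.jpg', frame)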

@vineetparikh (Author)

Hi Matteo, I pulled the results and focused specifically on evaluating LTMU-H. Here's the code:

import sys
sys.path.append('./TREK-150-toolkit')

from ltmuh import LTMUH
from toolkit.experiments import ExperimentTREK150

tracker = LTMUH()

root_dir = './TREK-150-toolkit/TREK-150' # set the path to TREK-150's root folder
exp = ExperimentTREK150(root_dir, result_dir='./TREK-150-Dunnhofer-Results', report_dir='./TREK-150-Dunnhofer-Report')
prot = 'ope'

# Run an experiment with the protocol of interest and save results
# exp.run(tracker, protocol=prot, visualize=False)

# Generate a report for the protocol of interest
exp.report([tracker.name], protocol=prot)

I still get results for LTMU-H that are lower than those in the report. Here are the success, normalized precision, and GSR plots:

[Attached: success, normalized precision, and GSR plots]
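
Once I have the precomputed files, I can diff them against my own raw outputs per sequence to localize where they diverge. A minimal sketch, assuming both results are plain-text files with one comma-separated x,y,w,h box per row (the file format is an assumption on my part):

import numpy as np

def mean_iou(mine_path, theirs_path):
    """Frame-by-frame IoU between two result files, each assumed to
    hold one comma-separated x,y,w,h box per row."""
    a = np.loadtxt(mine_path, delimiter=',')
    b = np.loadtxt(theirs_path, delimiter=',')
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    # Intersection rectangle per frame, clipped at zero overlap.
    x1 = np.maximum(a[:, 0], b[:, 0])
    y1 = np.maximum(a[:, 1], b[:, 1])
    x2 = np.minimum(a[:, 0] + a[:, 2], b[:, 0] + b[:, 2])
    y2 = np.minimum(a[:, 1] + a[:, 3], b[:, 1] + b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = a[:, 2] * a[:, 3] + b[:, 2] * b[:, 3] - inter
    return float(np.mean(inter / np.maximum(union, 1e-9)))

A sequence whose mean IoU sits far below 1.0 would be the place to start digging.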

@vineetparikh (Author)

For some reason I can't attach the YAML file for my conda env, so I'll post it as plaintext here; this should be importable:

name: ltmuh
channels:
  - conda-forge
  - huggingface
  - iopath
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - ca-certificates=2022.4.26=h06a4308_0
  - certifi=2021.5.30=py36h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - ncurses=6.3=h7f8727e_2
  - openssl=1.1.1o=h7f8727e_0
  - pip=21.2.2=py36h06a4308_0
  - python=3.6.13=h12debd9_1
  - readline=8.1.2=h7f8727e_1
  - setuptools=58.0.4=py36h06a4308_0
  - sqlite=3.38.3=hc218d9a_0
  - tk=8.6.12=h1ccaba5_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.5=h7f8727e_1
  - zlib=1.2.12=h7f8727e_2
  - pip:
    - cffi==1.15.0
    - cycler==0.11.0
    - cython==0.29.30
    - dataclasses==0.8
    - easydict==1.9
    - fire==0.4.0
    - future==0.18.2
    - got10k==0.1.3
    - importlib-resources==5.4.0
    - jinja2==3.0.3
    - joblib==1.1.0
    - jpeg4py==0.1.4
    - kiwisolver==1.3.1
    - lmdb==1.3.0
    - markupsafe==2.0.1
    - matplotlib==3.3.4
    - msgpack==1.0.4
    - numpy==1.19.5
    - opencv-python==4.6.0.66
    - pascal-voc-writer==0.1.4
    - pillow==8.4.0
    - protobuf==3.19.4
    - pycparser==2.21
    - pyparsing==3.0.9
    - python-dateutil==2.8.2
    - pyyaml==5.3.1
    - scikit-learn==0.24.2
    - scipy==1.2.1
    - shapely==1.8.4
    - six==1.16.0
    - sklearn==0.0
    - tensorboardx==2.5.1
    - termcolor==1.1.0
    - threadpoolctl==3.1.0
    - timm==0.3.2
    - torch==1.4.0
    - torchvision==0.5.0
    - tqdm==4.19.9
    - typing-extensions==4.1.1
    - wget==3.2
    - yacs==0.1.8
    - zipp==3.6.0
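
And a quick runtime check to confirm the active environment actually matches these pins (note that opencv-python imports as cv2):

import sys
import numpy, cv2, torch, torchvision

# Compare against the pins above: python=3.6.13, numpy==1.19.5,
# opencv-python==4.6.0.66, torch==1.4.0, torchvision==0.5.0.
print('python     ', sys.version.split()[0])
print('numpy      ', numpy.__version__)
print('opencv     ', cv2.__version__)
print('torch      ', torch.__version__)
print('torchvision', torchvision.__version__)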

@vineetparikh (Author)

Any idea as to what's going on?

@matteo-dunnhofer (Owner)

I tried again and I still obtain the correct results. The YAML looks good. There might be something wrong with your annotation files. Send an e-mail to [email protected] and I will share a different version.

@vineetparikh (Author)

vineetparikh commented Apr 4, 2024

Email sent! I'm still confused about why my reproduced results differ from the linked ones, but we can take this discussion offline and update the thread with results.

@vineetparikh (Author)

I'm also willing to find time and hop on a call to debug!

@matteo-dunnhofer (Owner)

I replied to your e-mail. It's quite a busy period for me, so let's try to solve the issue offline first.

@relh

relh commented Apr 15, 2024

I just re-did everything from scratch from the repo and got these results:

[Attached: generalized success robustness, normalized precision, and success plots]

@matteo-dunnhofer (Owner)

This is the expected behaviour. Thanks for pointing it out, @relh!

@vineetparikh (Author)

vineetparikh commented Apr 16, 2024

Thanks @relh for reproducing and confirming it's a setup issue on my end! I'll follow up with you on fixing the inconsistencies in my setup.

(I'll leave this issue open until I figure this out and post the fix below, but will work on this offline: thanks to Matteo for all the help as well!)
