LTMU-H reproduction not performing as well as reported? #3

Open · vineetparikh opened this issue Apr 1, 2024 · 14 comments

@vineetparikh

Hi there, thanks so much for the great work and toolkit for future benchmarks!

I'm running the LTMU-H baseline on TREK-150 under the OPE protocol to get an initial sense of quantitative performance, and I'm finding that SS (Success Score), NPS (Normalized Precision Score), and GSR (Generalized Success Robustness) are significantly lower than the values reported in the paper. I've posted my numbers below.

I followed the setup guidelines, so my initial thought is that something differs between my setup and the one used for the paper's evaluation. Any idea what's going on?

[Attached: success, normalized precision, and generalized success robustness plots showing the reproduced scores]

@vineetparikh (Author)

For context, I'm using the same HOI and STARK checkpoints listed in the README, so I'm not sure whether any additional training is needed, or whether a different checkpoint produces the results in the paper.

@matteo-dunnhofer (Owner)

Hi @vineetparikh,

That's strange. I have tested the repo multiple times and always obtained the correct results. No additional training is needed, and no checkpoints other than those posted in the README are required. Maybe something is wrong with the frames or annotations? Did you try running the report method on the precomputed results we provide?
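
Something like the following should work, with result_dir pointed at the unpacked precomputed files (the exact directory layout and the 'LTMU-H' name string here are assumptions, so adjust them to match the downloaded files):

import sys
sys.path.append('./TREK-150-toolkit')

from toolkit.experiments import ExperimentTREK150

root_dir = './TREK-150-toolkit/TREK-150'

# result_dir should contain the unpacked precomputed results
# (assumed layout: <result_dir>/<tracker_name>/<per-sequence files>).
exp = ExperimentTREK150(root_dir,
                        result_dir='./precomputed-results',
                        report_dir='./precomputed-report')

# Generate the report without calling exp.run(), so the precomputed
# results are scored as-is. 'LTMU-H' is an assumed tracker name string.
exp.report(['LTMU-H'], protocol='ope')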

@vineetparikh (Author)

Yup, I pre-extracted the frames with the same ffmpeg version and visualized them to make sure the annotations looked right (I had actually opened another issue at matteo-dunnhofer/TREK-150-toolkit#5 before fixing that). Where can I find the precomputed results? I ran everything from scratch and got my results that way.
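
For reference, my sanity check was roughly the following, assuming the GOT-10k-style sequence layout (an img/ subfolder plus a groundtruth_rect.txt of comma-separated x,y,w,h rows; the folder name and frame filename pattern below are placeholders):

import cv2
import numpy as np

# Placeholder sequence folder; layout assumed GOT-10k style.
seq_dir = './TREK-150-toolkit/TREK-150/sequence-name'

# First ground-truth box: one comma-separated x,y,w,h row per frame.
gt = np.loadtxt(seq_dir + '/groundtruth_rect.txt', delimiter=',')
x, y, w, h = gt[0].astype(int)

# Draw the first ground-truth box on the first extracted frame.
frame = cv2.imread(seq_dir + '/img/00000001.jpg')  # assumed filename pattern
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('check_frame0.jpg', frame)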

@vineetparikh (Author)

Hi Matteo, I pulled the results and focused specifically on evaluating LTMU-H. Here's the code:

import sys
sys.path.append('./TREK-150-toolkit')

from ltmuh import LTMUH
from toolkit.experiments import ExperimentTREK150

tracker = LTMUH()

root_dir = './TREK-150-toolkit/TREK-150' # set the path to TREK-150's root folder
exp = ExperimentTREK150(root_dir, result_dir='./TREK-150-Dunnhofer-Results', report_dir='./TREK-150-Dunnhofer-Report')
prot = 'ope'

# Run an experiment with the protocol of interest and save results
# exp.run(tracker, protocol=prot, visualize=False)

# Generate a report for the protocol of interest
exp.report([tracker.name], protocol=prot)

I still get results for LTMU-H that are lower than those in the report. Here are the success, normalized precision, and GSR plots:

[Attached: success, normalized precision, and GSR plots]
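
Once I have the precomputed files, I can diff them against my own raw outputs per sequence to localize where they diverge. A minimal sketch, assuming both results are plain-text files with one comma-separated x,y,w,h box per row (the file format is an assumption on my part):

import numpy as np

def mean_iou(mine_path, theirs_path):
    """Frame-by-frame IoU between two result files, each assumed to
    hold one comma-separated x,y,w,h box per row."""
    a = np.loadtxt(mine_path, delimiter=',')
    b = np.loadtxt(theirs_path, delimiter=',')
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    # Intersection rectangle per frame, clipped at zero overlap.
    x1 = np.maximum(a[:, 0], b[:, 0])
    y1 = np.maximum(a[:, 1], b[:, 1])
    x2 = np.minimum(a[:, 0] + a[:, 2], b[:, 0] + b[:, 2])
    y2 = np.minimum(a[:, 1] + a[:, 3], b[:, 1] + b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = a[:, 2] * a[:, 3] + b[:, 2] * b[:, 3] - inter
    return float(np.mean(inter / np.maximum(union, 1e-9)))

A sequence whose mean IoU sits far below 1.0 would be the place to start digging.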

@vineetparikh (Author)

For some reason I can't attach the YAML file for my conda env, so I'll post it as plaintext here; this should be importable:

name: ltmuh
channels:
  - conda-forge
  - huggingface
  - iopath
  - pytorch
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - ca-certificates=2022.4.26=h06a4308_0
  - certifi=2021.5.30=py36h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.3=he6710b0_2
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - ncurses=6.3=h7f8727e_2
  - openssl=1.1.1o=h7f8727e_0
  - pip=21.2.2=py36h06a4308_0
  - python=3.6.13=h12debd9_1
  - readline=8.1.2=h7f8727e_1
  - setuptools=58.0.4=py36h06a4308_0
  - sqlite=3.38.3=hc218d9a_0
  - tk=8.6.12=h1ccaba5_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.5=h7f8727e_1
  - zlib=1.2.12=h7f8727e_2
  - pip:
    - cffi==1.15.0
    - cycler==0.11.0
    - cython==0.29.30
    - dataclasses==0.8
    - easydict==1.9
    - fire==0.4.0
    - future==0.18.2
    - got10k==0.1.3
    - importlib-resources==5.4.0
    - jinja2==3.0.3
    - joblib==1.1.0
    - jpeg4py==0.1.4
    - kiwisolver==1.3.1
    - lmdb==1.3.0
    - markupsafe==2.0.1
    - matplotlib==3.3.4
    - msgpack==1.0.4
    - numpy==1.19.5
    - opencv-python==4.6.0.66
    - pascal-voc-writer==0.1.4
    - pillow==8.4.0
    - protobuf==3.19.4
    - pycparser==2.21
    - pyparsing==3.0.9
    - python-dateutil==2.8.2
    - pyyaml==5.3.1
    - scikit-learn==0.24.2
    - scipy==1.2.1
    - shapely==1.8.4
    - six==1.16.0
    - sklearn==0.0
    - tensorboardx==2.5.1
    - termcolor==1.1.0
    - threadpoolctl==3.1.0
    - timm==0.3.2
    - torch==1.4.0
    - torchvision==0.5.0
    - tqdm==4.19.9
    - typing-extensions==4.1.1
    - wget==3.2
    - yacs==0.1.8
    - zipp==3.6.0
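
And a quick runtime check to confirm the active environment actually matches these pins (note that opencv-python imports as cv2):

import sys
import numpy, cv2, torch, torchvision

# Compare against the pins above: python=3.6.13, numpy==1.19.5,
# opencv-python==4.6.0.66, torch==1.4.0, torchvision==0.5.0.
print('python     ', sys.version.split()[0])
print('numpy      ', numpy.__version__)
print('opencv     ', cv2.__version__)
print('torch      ', torch.__version__)
print('torchvision', torchvision.__version__)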

@vineetparikh (Author)

Any idea as to what's going on?

@matteo-dunnhofer (Owner)

I tried again and I still obtain the correct results. The YAML looks good. There might be something wrong with your annotation files. Send an e-mail to [email protected] and I will share a different version.

@vineetparikh (Author)

vineetparikh commented Apr 4, 2024

Email sent! I'm still confused about why my reproduced results differ from the linked ones, but we can take this discussion offline and update the thread with results.

@vineetparikh (Author)

I'm also willing to find time and hop on a call to debug!

@matteo-dunnhofer (Owner)

I replied to your e-mail. It's quite a busy period for me, so let's try to solve the issue offline first.

@relh

relh commented Apr 15, 2024

I just re-did everything from scratch from the repo and got these results:

[Attached: generalized success robustness, normalized precision, and success plots]

@matteo-dunnhofer (Owner)

This is the expected behaviour. Thanks for pointing it out, @relh!

@vineetparikh (Author)

vineetparikh commented Apr 16, 2024

Thanks @relh for reproducing and confirming it's a setup issue on my end! I'll follow up with you on fixing the inconsistencies in my setup.

(I'll leave this issue open until I figure this out and post the fix below, but will work on this offline: thanks to Matteo for all the help as well!)
