Organization of data #1

Open
rese1f opened this issue Jul 18, 2024 · 5 comments

@rese1f

rese1f commented Jul 18, 2024

I was working with the data, and I have some questions about the structure of the dataset.

How does the formatting of the identifiers work? I can see a year and a YouTube id, but I'm not sure what the last two numbers mean. I've tried downloading the videos and splitting them into clips, but I'm also unclear what the two numbers in the bounds folder represent. In some of the files the start_bound is larger than the end_bound, and after extracting clips, some of them don't match the captions, so maybe the frames I extracted don't align with the ones that were captioned, but there is no way to verify this.

Additionally, is there a reason the data is split into a separate file for every attribute instead of one larger JSON? It seems to bloat the file I/O when, for example, there are 114k numpy files that each store just two integers.

@robincourant
Owner

Hi,

Thank you for your interest in our dataset.

The file identifier follows this format: {year}_{video_id}_{shot_index}_{chunk_index}. The year, video_id, and shot_index correspond to attributes from CondensedMovies, while the chunk_index is generated during my preprocessing.

You can find the frame bounds for each sample in the bounds folder of the dataset.
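For concreteness, parsing an identifier and reading a clip's bounds looks roughly like this (a minimal sketch; it assumes the bounds files are .npy pairs of [start_frame, end_frame] named after the identifier, so adapt the paths to your local copy):

```python
import numpy as np
from pathlib import Path


def parse_identifier(stem: str):
    """Split '{year}_{video_id}_{shot_index}_{chunk_index}' into its parts.

    YouTube ids can themselves contain underscores, so split from both
    ends rather than on every underscore.
    """
    parts = stem.split("_")
    year = parts[0]
    video_id = "_".join(parts[1:-2])
    shot_index, chunk_index = int(parts[-2]), int(parts[-1])
    return year, video_id, shot_index, chunk_index


def load_bounds(dataset_root, identifier):
    """Return the (start_frame, end_frame) pair stored for one clip."""
    start, end = np.load(Path(dataset_root) / "bounds" / f"{identifier}.npy")
    return int(start), int(end)


print(parse_identifier("2016_pQ68ImO9dBU_00000_00003"))
# -> ('2016', 'pQ68ImO9dBU', 0, 3)
```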

@patina27

Hi Robin, thanks for your hard work on the dataset, it's very much appreciated! However, I think I'm running into the same issue rese1f mentioned: the start and end indices in the bounds files don't match the corresponding caption_cam and traj npy files in terms of length (e.g. end - start does not equal the length of the camera trajectory file), and some of the clips have start frame > end frame.

Example:
2016_pQ68ImO9dBU_00000_00003
2019_aoc1wqaK8cc_00008_00001
2019_ML4CcaTFsZE_00010_00001

I've also attached a txt file listing the mismatches:
mismatch.txt
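
For reference, this is roughly the consistency check I ran (a quick sketch; the dataset root and the assumption that bounds/ stores [start, end] pairs while traj/ stores one pose per frame are mine):

```python
import numpy as np
from pathlib import Path

root = Path("dataset")  # wherever the dataset is extracted locally

mismatches = []
for bounds_file in sorted((root / "bounds").glob("*.npy")):
    bounds = np.load(bounds_file)
    start, end = int(bounds[0]), int(bounds[1])
    n_poses = len(np.load(root / "traj" / bounds_file.name))
    # Flag clips with inverted bounds or a length that disagrees with the trajectory.
    if start > end or (end - start) != n_poses:
        mismatches.append(f"{bounds_file.stem}: bounds=[{start}, {end}], traj length={n_poses}")

Path("mismatch.txt").write_text("\n".join(mismatches) + "\n")
```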

@robincourant
Owner

Hi,

Thank you for pointing out the issue. I apologize for the typo in my script. I have recomputed the bounds and reuploaded them there. Please let me know if you encounter any further problems.

@patina27

Hi Robin, thanks for the update! Looks like the bounds are making more sense now!

I still find the camera captions not very accurate, and they sometimes look misaligned. I also noticed that for some videos, the length of the array in cam_segments does not match that of traj/xxx.npy. Could this be an issue?

Best,

@robincourant
Owner

Hi,

Do you have examples of camera segments that do not match the trajectory file length?
Keep in mind that the number of camera segments will always be one less than the trajectory length, as each segment is defined between two poses.
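In code, the expected relationship is simply the following (a small sketch; the traj/ and cam_segments/ paths mirror the folders mentioned above, and the example identifier is one of the clips listed earlier in this thread):

```python
import numpy as np
from pathlib import Path

root = Path("dataset")  # wherever the dataset is extracted locally
identifier = "2016_pQ68ImO9dBU_00000_00003"  # example clip from this thread

# One camera pose per frame.
traj = np.load(root / "traj" / f"{identifier}.npy")
# One label per pose-to-pose transition (allow_pickle in case labels are stored as objects).
segments = np.load(root / "cam_segments" / f"{identifier}.npy", allow_pickle=True)

# Each segment is defined between two consecutive poses, so there is
# always exactly one fewer segment than there are poses.
assert len(segments) == len(traj) - 1, (len(segments), len(traj))
```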
