Organization of data #1
Comments
Hi, thank you for your interest in our dataset. The file identifier follows this format: … You can find the frame bounds for each sample in the …
Hi Robin, thanks for your hard work on the dataset, very appreciated! However, I think I am having the same issue as rese1f mentioned. Specifically, I found that the start and end indices in the bounds don't match up with the corresponding caption_cam and traj npy files in terms of length (e.g., end - start does not match the length of the camera trajectory file), and some of the clips have start frame > end frame. Example: I've also attached a mismatch txt file.
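For reference, a minimal sketch of the kind of check that produces such a mismatch list (the bounds/ and traj/ directory names and the two-integer bounds format are assumptions about the layout, not documented behavior):

```python
# Sketch of a bounds/trajectory consistency check; directory layout
# (bounds/, traj/) and the [start, end] .npy format are assumptions.
import numpy as np
from pathlib import Path

root = Path("dataset")
mismatches = []

for bounds_file in sorted((root / "bounds").glob("*.npy")):
    start, end = np.load(bounds_file).astype(int)  # assumed [start, end] pair
    traj_file = root / "traj" / bounds_file.name
    if start > end:
        mismatches.append(f"{bounds_file.stem}: start {start} > end {end}")
    elif traj_file.exists():
        n_traj = len(np.load(traj_file))
        if end - start != n_traj:
            mismatches.append(
                f"{bounds_file.stem}: span {end - start} != traj length {n_traj}"
            )

Path("mismatch.txt").write_text("\n".join(mismatches) + "\n")
```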
Hi, thank you for pointing out the issue. I apologize for the typo in my script. I have recomputed the bounds and reuploaded them there. Please let me know if you encounter any further problems.
Hi Robin, thanks for the update! Looks like the bounds are making more sense now! I still find the camera captions not very accurate, though, and they look misaligned sometimes. I noticed that for some videos, the length of the array in cam_segments does not match that of traj/xxx.npy. Could this be an issue? Best,
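Roughly, the comparison I mean is the following (a sketch; the directory names and the use of allow_pickle are assumptions about how the arrays were saved):

```python
# Rough comparison of cam_segments vs traj lengths per sample; file layout
# and allow_pickle are assumptions about how the arrays were stored.
import numpy as np
from pathlib import Path

root = Path("dataset")
for seg_file in sorted((root / "cam_segments").glob("*.npy")):
    segments = np.load(seg_file, allow_pickle=True)
    traj_file = root / "traj" / seg_file.name
    if traj_file.exists():
        traj = np.load(traj_file)
        if len(segments) != len(traj):
            print(f"{seg_file.stem}: {len(segments)} segments "
                  f"vs {len(traj)} trajectory entries")
```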
Hi, do you have examples of camera segments that do not match the trajectory file length?
I was working on the data, and I have some questions regarding the structure of the dataset.
How does the formatting of the identifiers work? I see that it has a year number and a YouTube id, but I'm not sure what the last 2 numbers mean.

I've tried downloading the videos and then splitting them into clips, but I am unclear what the 2 numbers in the bounds folder are. In some of the files the start_bound is larger than the end_bound, and after extracting clips, some clips don't match the captions, so maybe the frames I extracted don't align with the ones that were captioned, but there is no way to verify.

Additionally, is there a reason that the data is formatted into a separate file for every single attribute instead of a larger json? It seems to bloat the file IO when, for example, there are 114k numpy files that each store just 2 integers.
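For illustration, consolidating the per-sample bounds into a single JSON could look roughly like this (a sketch; the bounds/ directory name and the two-integer .npy format are assumptions):

```python
# Hypothetical consolidation: merge the ~114k two-integer bounds .npy files
# into one JSON keyed by sample identifier. Paths and format are assumptions.
import json
import numpy as np
from pathlib import Path

bounds = {
    f.stem: [int(v) for v in np.load(f)]  # assumed [start, end] pair
    for f in Path("bounds").glob("*.npy")
}

Path("bounds.json").write_text(json.dumps(bounds, indent=2))
```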