Issue with provided Dataset and Package Compatibility #9

Open
DishonestOne opened this issue Apr 10, 2023 · 6 comments

Comments

@DishonestOne

I've been trying to use this package, but I'm struggling to get it into a usable state. At the moment, the only way I can see it working is to more or less rewrite the code from the ground up, and I doubt that should be necessary.

Firstly, I tried to verify the dataset by typing in the following:

python3 obb_anns/debugging/verify_dataset.py obb_anns/sample

and here is the result:

Checking file 1 of 1...
loading ann_info...
Traceback (most recent call last):
  File "obb_anns/debugging/verify_dataset.py", line 24, in <module>
    a.load_annotations()
  File "Project/obb_anns/obb_anns/obb_anns.py", line 116, in load_annotations
    with open(self.ann_file, 'r') as ann_file:
FileNotFoundError: [Errno 2] No such file or directory: 'obb_anns/sample/obb_anns/sample/deepscores_v2_sample.json'

The root directory is evidently being prepended to a path that already contains it, so I changed the following lines:

  • line 23:
    • before: a = OBBAnns(join(args.ROOT, dataset_ann_fp))
    • after: a = OBBAnns(dataset_ann_fp)
  • line 36:
    • before: images_dir = root_dir / 'images_png'
    • after: images_dir = root_dir / 'images'

With those changes the result was the following, which I assume means that it works:

Checking file 1 of 1...
loading ann_info...
done! t=0.00s
100%|██████████| 1/1 [00:00<00:00, 97.97imgs/s]
Checking if every image has its annotation in the dataset...
1imgs [00:00, 12945.38imgs/s]
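
The doubled path in the first traceback is what a plain path join produces when the second argument already starts with the root directory. The following is a minimal sketch of that behaviour and of one way to guard against it. The helper name `resolve_ann_path` is hypothetical and not part of obb_anns, and `posixpath` is used only so the result matches the forward-slash paths shown above.

```python
import posixpath


def resolve_ann_path(root: str, ann_path: str) -> str:
    """Join root and ann_path unless ann_path already lies under root.

    Hypothetical helper for illustration; not part of obb_anns.
    """
    root_n = posixpath.normpath(root)
    ann_n = posixpath.normpath(ann_path)
    # If the annotation path already starts with the root, use it as-is.
    if ann_n.startswith(root_n + "/"):
        return ann_n
    return posixpath.join(root_n, ann_n)


root = "obb_anns/sample"
found = "obb_anns/sample/deepscores_v2_sample.json"

# A plain join repeats the root, as in the FileNotFoundError above.
naive = posixpath.join(root, found)

# The guarded version leaves an already-rooted path untouched.
fixed = resolve_ann_path(root, found)
relative = resolve_ann_path(root, "deepscores_v2_sample.json")

print(naive)
print(fixed)
print(relative)
```

Whether the script should strip the root or skip the join is a design choice for the maintainers; the sketch only shows why the path ends up repeated.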

I then ran the same script on ds2_dense, and the output below is what I get every time. I'm not sure why this happens for that dataset, and I don't know enough to fix it:
image
image
(I'm posting screenshots because the "imagename.png is not in any JSON file" error repeats for so long that the start of the output has scrolled out of reach.)
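
One way to narrow this down without scrolling through terminal output is to compute the mismatch directly and write it to a file. The sketch below assumes each annotation JSON has a top-level "images" list whose entries carry a "filename" key; that layout is an assumption and should be checked against the actual files. The function name and the toy file names are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def images_missing_from_annotations(images_dir, ann_files, key="filename"):
    """Return PNG names found on disk but listed in none of the JSON files.

    Assumes each JSON has a top-level "images" list of dicts with `key`.
    """
    listed = set()
    for ann_file in ann_files:
        with open(ann_file, "r") as f:
            data = json.load(f)
        for entry in data.get("images", []):
            listed.add(entry[key])
    on_disk = {p.name for p in Path(images_dir).glob("*.png")}
    return sorted(on_disk - listed)


# Toy demonstration in a temporary directory (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    images_dir = root / "images"
    images_dir.mkdir()
    for name in ("a.png", "b.png", "c.png"):
        (images_dir / name).touch()

    ann_file = root / "toy_train.json"
    ann_file.write_text(
        json.dumps({"images": [{"filename": "a.png"}, {"filename": "c.png"}]})
    )

    missing = images_missing_from_annotations(images_dir, [ann_file])
    # Writing the list to a file keeps the full result retrievable.
    (root / "missing_images.txt").write_text("\n".join(missing))

print(missing)
```

Redirecting the script's output to a file (for example with `> verify.log 2>&1` in the shell) would also preserve the start of the log.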

The program also appears to need a proposals.json file, but none of the datasets ship with one. The debugging tools include a generate_test_proposals.py, so I tried the following:

python3 obb_anns/debugging/generate_test_proposals.py obb_anns/sample/deepscores_v2_sample.json

and this is the result:

loading ann_info...
done! t=0.01s
Traceback (most recent call last):
  File "obb_anns/debugging/generate_test_proposals.py", line 56, in <module>
    main(args.GT)
  File "obb_anns/debugging/generate_test_proposals.py", line 35, in main
    bboxes = gt.ann_info[['bbox', 'cat_id', 'img_id']]
  File ".local/lib/python3.8/site-packages/pandas/core/frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1550, in _get_listlike_indexer
    self._validate_read_indexer(
  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1644, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['bbox'] not in index"

The problem lies in how the dataset defines its bounding boxes. The code expects a single bbox field, but in the JSON files it is split into two, a_bbox and o_bbox:
image
image

From here I could replace every instance of bbox with a_bbox and o_bbox, but that affects the rest of the code as well, since obb_anns.py also defines bbox as a single parameter. Unless there is another compatible dataset that I am unaware of, is there a solution that comes to mind?
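
As a possible stopgap that avoids a global search-and-replace, the column could be selected by whichever name is present and renamed to bbox at the point of use. This is only a sketch on a toy DataFrame: the values are made up, the helper `select_boxes` is hypothetical, and it assumes ann_info exposes a_bbox and o_bbox as ordinary columns, as the screenshots suggest.

```python
import pandas as pd

# Toy stand-in for gt.ann_info; the values are invented for illustration.
ann_info = pd.DataFrame(
    {
        "a_bbox": [[10.0, 20.0, 30.0, 40.0]],
        "o_bbox": [[10.0, 20.0, 30.0, 20.0, 30.0, 40.0, 10.0, 40.0]],
        "cat_id": [1],
        "img_id": [7],
    }
)


def select_boxes(df, box_column="a_bbox"):
    """Pick one box column plus cat_id and img_id, exposing it as 'bbox'.

    Hypothetical helper; raises a KeyError naming the available columns.
    """
    wanted = [box_column, "cat_id", "img_id"]
    missing = [c for c in wanted if c not in df.columns]
    if missing:
        raise KeyError(
            f"columns {missing} not found; available: {list(df.columns)}"
        )
    # Rename so downstream code that expects 'bbox' keeps working.
    return df[wanted].rename(columns={box_column: "bbox"})


boxes = select_boxes(ann_info)
print(list(boxes.columns))
```

Passing `box_column="o_bbox"` would select the other field instead; which of the two the rest of the package expects is not something this sketch can settle.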

@fablau

fablau commented Apr 11, 2023

Thank you for posting this; I have been struggling with this package myself, trying to figure out how to use it.

By following the official instructions, I encounter errors like this one.

It is my guess that this package is abandoned and no longer maintained.

I am wondering if you could find a way to use the DeepScores dataset with modern frameworks like mmdetection or similar. I am spending a huge amount of time trying to understand how to use that dataset, which is in a rather non-standard format.

Any thoughts on that are very welcome!

Thanks again.

@DishonestOne
Author

I have stumbled across a change that may help with this issue.
The debugging tools and the sample dataset in this toolkit may be outdated, but the dataset itself can be visualized properly.
image
This is the exact code I used to render the image.
image
This is what happens when I run the same code but point it at the sample dataset and change img_idx to 0 (since there is only one image).
04-24_214921
This is the resulting rendered image.
04-24_215029
This is the resulting rendered image with instances = False. Supposedly there should be a way to show the unique numerical ID of each symbol instead of the recognized symbol name (e.g. noteheadBlackOnLine and timesig4).

This does mean that the dataset is usable to some extent, but there does not appear to be a way to create proposals (are proposals actually necessary? I am not sure), to train new models from scratch, or at least to feed in pages of sheet music and get a JSON file out, at least from what I have seen. Let me know if there is something I have overlooked, thanks!

@fablau

fablau commented Apr 25, 2023

This is awesome! Thank you so much!

I am researching a way to train a model from scratch as well, and I have some ideas to try. MMDetection is one, Detectron2 is another. I'll try them both and see which one gives better results.

It is my understanding that the main problem is that this dataset has its own format, so it should probably be converted into another format first (e.g. COCO or YOLO) and then treated like any other "standard" dataset.
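
For the box geometry alone, such a conversion could look roughly like the sketch below. It assumes the axis-aligned a_bbox is stored as [x0, y0, x1, y1] in pixels, which should be verified against the dataset. COCO boxes are [x, y, width, height] in pixels, and YOLO boxes are [x_center, y_center, width, height] normalized by the image size. The function names are made up, and category mapping and file layout are left out.

```python
def xyxy_to_coco(box):
    """Convert [x0, y0, x1, y1] to COCO-style [x, y, width, height]."""
    x0, y0, x1, y1 = box
    return [x0, y0, x1 - x0, y1 - y0]


def xyxy_to_yolo(box, img_w, img_h):
    """Convert [x0, y0, x1, y1] to YOLO-style normalized
    [x_center, y_center, width, height]."""
    x0, y0, x1, y1 = box
    return [
        (x0 + x1) / 2 / img_w,
        (y0 + y1) / 2 / img_h,
        (x1 - x0) / img_w,
        (y1 - y0) / img_h,
    ]


# Toy box on a 100 x 200 pixel image (values invented for illustration).
box = [10, 20, 30, 60]
coco_box = xyxy_to_coco(box)
yolo_box = xyxy_to_yolo(box, img_w=100, img_h=200)

print(coco_box)
print(yolo_box)
```

The oriented o_bbox has no direct equivalent in plain COCO or YOLO boxes, so a converter would either drop it or target a format that supports rotated boxes.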

I'll post what I find out. Thanks again!

@tangruinenu

Hello, I also want to convert the labels of this dataset to COCO or YOLO format. Have you managed to convert them?

@fablau

fablau commented Jun 2, 2023

Not yet, unfortunately. I had to take care of more urgent matters, but I plan to do it in the coming weeks. I'll keep you posted ;)

@tangruinenu

@fablau

Thank you. Let me know if you succeed. Thank you so much
