JAMS beyond music? #24
Yes, I thought of the "image" application of JAMS as well. We should do a bit of research to find out what people in image/video processing use to annotate their datasets. But I like the idea of extending this, at least for a future release after the first "official" one.
I think a lower-hanging fruit would be non-music audio datasets (e.g. environmental sounds). I'm probably biased, but I feel this is an area where the need for annotated datasets is growing rapidly, and one that would require minimal (or zero?) additional work to accommodate, right? Oh, and there's speech too...
Forgive overlap with other issues that escape my memory, but it seems [...]
It depends on what the annotations look like, but I would expect most of this data to look like `tag_*` annotations, just like what we already support. The point I was getting at originally is not so much the domain of the data, but the way in which the extent of an annotation is encoded.
We already support lyrics with the current schema; afaik, nothing needs to change?
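(For reference, a minimal sketch of what that looks like with the current jams API; the `lyrics` namespace ships with jams, though the timestamps and words here are made up:)

```python
import jams

# Lyrics are just sparse (time, duration, value, confidence) observations
# under the existing "lyrics" namespace; no schema changes needed.
ann = jams.Annotation(namespace='lyrics')
ann.append(time=1.0, duration=0.5, value='hello', confidence=None)
ann.append(time=1.5, duration=0.5, value='world', confidence=None)
```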
I was thinking about this today while talking to some folks working on speech / general audio. One of the issues there is that our metadata schema might not be appropriate for non-music annotations. I think this issue could actually be merged with #98 / a schema refactor that promotes all jams classes to top-level definitions. The reasoning here is that if we move [...]. Similarly, we could abstract the [...]. What do folks think? @ejhumphrey @justinsalamon ? EDIT: tagging @stevemclaugh
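(As a purely illustrative sketch of the refactor idea: if every class is a top-level definition, an alternate metadata schema could be swapped in via `$ref`/`anyOf` without touching the rest of the document. None of the names below exist in the current schema:)

```python
# Sketch only: "SpeechFileMetadata" is hypothetical, and the field
# contents of each definition are elided.
JAMS_SCHEMA = {
    "definitions": {
        # Music-centric metadata, roughly as today.
        "FileMetadata": {"type": "object"},
        # Hypothetical alternate metadata for, e.g., speech corpora.
        "SpeechFileMetadata": {"type": "object"},
        "Annotation": {"type": "object"},
    },
    "type": "object",
    "properties": {
        # Swapping metadata schemas becomes a matter of adding a $ref here.
        "file_metadata": {
            "anyOf": [
                {"$ref": "#/definitions/FileMetadata"},
                {"$ref": "#/definitions/SpeechFileMetadata"},
            ]
        },
        "annotations": {
            "type": "array",
            "items": {"$ref": "#/definitions/Annotation"},
        },
    },
}
```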
Thinking about this more: a complication here would be that dynamic reconstruction of the corresponding jams class for alternate metadata schemas could get tricky. We get around this ([...])
The above might be resolved if we specify a type mapping for all schema objects: https://python-jsonschema.readthedocs.io/en/latest/validate/#validating-with-additional-types
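(A minimal sketch of how that could work with jsonschema's `TypeChecker`; the `Observation` class below is a hypothetical stand-in, but `redefine`/`extend` are the library's actual API in recent versions — older versions used a `types=` keyword argument instead:)

```python
from collections.abc import Mapping

from jsonschema import Draft7Validator, validators

class Observation(Mapping):
    """Hypothetical dict-like observation object (not the real jams class)."""
    def __init__(self, **fields):
        self._fields = fields
    def __getitem__(self, key):
        return self._fields[key]
    def __iter__(self):
        return iter(self._fields)
    def __len__(self):
        return len(self._fields)

# By default, Draft7Validator only treats `dict` as a JSON "object";
# redefine the type check so Observation instances validate directly.
type_checker = Draft7Validator.TYPE_CHECKER.redefine(
    "object", lambda checker, instance: isinstance(instance, (dict, Mapping)))
JAMSValidator = validators.extend(Draft7Validator, type_checker=type_checker)

schema = {"type": "object",
          "properties": {"time": {"type": "number", "minimum": 0}},
          "required": ["time"]}
JAMSValidator(schema).validate(Observation(time=0.0, duration=1.0))  # passes
```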
I very much agree about generalization, and have wondered about this since my days hacking away at OMR (which was almost jamsy). I wonder if something like CrowdFlower would be interested in collaborating on their image annotator...
Just opening up a separate thread here (rather than the already bloated #13): is it worth considering designing JAMS to be extensible into domains outside of music/time-series annotation?
I think the general architecture is flexible enough to make this possible with roughly zero overhead, and it might be a good idea.
From what I can tell, all that we'd have to do is restructure the schema a little so that `*Observation` is slightly more generic. We currently define two (arguably redundant) observation types that both encode tuples of `(time, duration, value, confidence)`. It wouldn't be hard to extend this into multiple observation forms: say, for images with bounding-box annotations, we would have `(x, x_extent, y, y_extent, value, confidence)`; for video, we would have `(x, x_extent, y, y_extent, t, duration, value, confidence)`; etc.

Within the schema, nothing would really change, except that we rename "DenseObservation" to "DenseTimeObservation" (and analogously for Sparse), and then, some time down the road, allow other observation schemas to be added.
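(To make that concrete, a hedged sketch of what the generalized observation definitions might look like, written as Python dicts in JSON-schema style; all names and field constraints are illustrative, not part of the current schema:)

```python
# Illustrative only: these definitions do not exist in the current JAMS
# schema; the fields follow the tuples proposed above.
OBSERVATION_DEFS = {
    # Today's (time, duration, value, confidence) tuple, renamed to leave
    # room for non-time extents.
    "DenseTimeObservation": {
        "type": "object",
        "properties": {
            "time": {"type": "number", "minimum": 0},
            "duration": {"type": "number", "minimum": 0},
            "value": {},        # constrained per-namespace, as today
            "confidence": {},
        },
        "required": ["time", "duration", "value", "confidence"],
    },
    # Hypothetical bounding-box form for image annotations.
    "ImageObservation": {
        "type": "object",
        "properties": {
            "x": {"type": "number"},
            "x_extent": {"type": "number"},
            "y": {"type": "number"},
            "y_extent": {"type": "number"},
            "value": {},
            "confidence": {},
        },
        "required": ["x", "x_extent", "y", "y_extent",
                     "value", "confidence"],
    },
}
```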
I don't think we need to tackle this for the immediate (next) release, except insofar as we can design to support it in the future in a backwards-compatible way.
Opinions?