JAMS beyond music? #24
Yes, I thought of the "image" application of JAMS as well. We should do a bit of research to find out what people in image/video processing use to annotate their datasets. But I like the idea of extending this, at least for a future release after the first "official" one.
I think a lower-hanging fruit would be non-music audio datasets (e.g. environmental sounds). I'm probably biased, but I feel this is an area where the need for annotated datasets is growing rapidly, and one that would require minimal (or zero?) additional work to accommodate, right? Oh, and there's speech too...
Forgive overlap with other issues that escape my memory, but it seems [...]
It depends on what the annotations look like, but I would expect most of this data to look like `tag_*` annotations, just like what we already support. The point I was getting at originally is not so much the domain of the data, but the way in which the extent of an annotation is encoded.
We already support lyrics with the current schema; afaik, nothing needs to change?
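(For reference, a minimal sketch of what that looks like with the current jams API; the `lyrics` namespace ships with jams, though the timestamps and words here are made up:)

```python
import jams

# Lyrics are just sparse (time, duration, value, confidence) observations
# under the existing "lyrics" namespace; no schema changes needed.
ann = jams.Annotation(namespace='lyrics')
ann.append(time=1.0, duration=0.5, value='hello', confidence=None)
ann.append(time=1.5, duration=0.5, value='world', confidence=None)
```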
I was thinking about this today while talking to some folks working on speech / general audio. One of the issues there is that our metadata schema might not be appropriate for non-music annotations. I think this issue could actually be merged with #98 / a schema refactor that promotes all jams classes to top-level definitions. The reasoning here is that if we move [...]. Similarly, we could abstract the [...]. What do folks think? @ejhumphrey @justinsalamon ? EDIT: tagging @stevemclaugh
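(As a purely illustrative sketch of the refactor idea: if every class is a top-level definition, an alternate metadata schema could be swapped in via `$ref`/`anyOf` without touching the rest of the document. None of the names below exist in the current schema:)

```python
# Sketch only: "SpeechFileMetadata" is hypothetical, and the field
# contents of each definition are elided.
JAMS_SCHEMA = {
    "definitions": {
        # Music-centric metadata, roughly as today.
        "FileMetadata": {"type": "object"},
        # Hypothetical alternate metadata for, e.g., speech corpora.
        "SpeechFileMetadata": {"type": "object"},
        "Annotation": {"type": "object"},
    },
    "type": "object",
    "properties": {
        # Swapping metadata schemas becomes a matter of adding a $ref here.
        "file_metadata": {
            "anyOf": [
                {"$ref": "#/definitions/FileMetadata"},
                {"$ref": "#/definitions/SpeechFileMetadata"},
            ]
        },
        "annotations": {
            "type": "array",
            "items": {"$ref": "#/definitions/Annotation"},
        },
    },
}
```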
Thinking about this more: a complication here would be that dynamic reconstruction of the corresponding jams class for alternate metadata schemas could get tricky. We get around this ([...])
The above might be resolved if we specify a type mapping for all schema objects: https://python-jsonschema.readthedocs.io/en/latest/validate/#validating-with-additional-types
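(A minimal sketch of how that could work with jsonschema's `TypeChecker`; the `Observation` class below is a hypothetical stand-in, but `redefine`/`extend` are the library's actual API in recent versions — older versions used a `types=` keyword argument instead:)

```python
from collections.abc import Mapping

from jsonschema import Draft7Validator, validators

class Observation(Mapping):
    """Hypothetical dict-like observation object (not the real jams class)."""
    def __init__(self, **fields):
        self._fields = fields
    def __getitem__(self, key):
        return self._fields[key]
    def __iter__(self):
        return iter(self._fields)
    def __len__(self):
        return len(self._fields)

# By default, Draft7Validator only treats `dict` as a JSON "object";
# redefine the type check so Observation instances validate directly.
type_checker = Draft7Validator.TYPE_CHECKER.redefine(
    "object", lambda checker, instance: isinstance(instance, (dict, Mapping)))
JAMSValidator = validators.extend(Draft7Validator, type_checker=type_checker)

schema = {"type": "object",
          "properties": {"time": {"type": "number", "minimum": 0}},
          "required": ["time"]}
JAMSValidator(schema).validate(Observation(time=0.0, duration=1.0))  # passes
```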
I very much agree about generalization, and have wondered about this since my days hacking away at OMR (which was almost jamsy). I wonder if something like CrowdFlower would be interested in collaborating on their image annotator...
Just opening up a separate thread here (rather than the already bloated #13): is it worth considering designing JAMS to be extensible into domains outside of music/time-series annotation?
I think the general architecture is flexible enough to make this possible with roughly zero overhead, and it might be a good idea.
From what I can tell, all that we'd have to do is restructure the schema a little so that `*Observation` is slightly more generic. We currently define two (arguably redundant) observation types that both encode tuples of `(time, duration, value, confidence)`. It wouldn't be hard to extend this into multiple observation forms: say, for images with bounding-box annotations, we would have `(x, x_extent, y, y_extent, value, confidence)`; for video, we would have `(x, x_extent, y, y_extent, t, duration, value, confidence)`; etc.

Within the schema, nothing would really change, except that we rename "DenseObservation" to "DenseTimeObservation" (and analogously for Sparse), and then, some time down the road, allow other observation schemas to be added.
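(To make that concrete, a hedged sketch of what the generalized observation definitions might look like, written as Python dicts in JSON-schema style; all names and field constraints are illustrative, not part of the current schema:)

```python
# Illustrative only: these definitions do not exist in the current JAMS
# schema; the fields follow the tuples proposed above.
OBSERVATION_DEFS = {
    # Today's (time, duration, value, confidence) tuple, renamed to leave
    # room for non-time extents.
    "DenseTimeObservation": {
        "type": "object",
        "properties": {
            "time": {"type": "number", "minimum": 0},
            "duration": {"type": "number", "minimum": 0},
            "value": {},        # constrained per-namespace, as today
            "confidence": {},
        },
        "required": ["time", "duration", "value", "confidence"],
    },
    # Hypothetical bounding-box form for image annotations.
    "ImageObservation": {
        "type": "object",
        "properties": {
            "x": {"type": "number"},
            "x_extent": {"type": "number"},
            "y": {"type": "number"},
            "y_extent": {"type": "number"},
            "value": {},
            "confidence": {},
        },
        "required": ["x", "x_extent", "y", "y_extent",
                     "value", "confidence"],
    },
}
```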
I don't think we need to tackle this for the immediate (next) release, except insofar as we can design to support it in the future in a backwards-compatible way.
Opinions?