Adding image classifications #41

Closed
marcverhagen opened this issue Dec 13, 2023 · 9 comments
Labels
✨N New feature or request

Comments

@marcverhagen
Contributor

marcverhagen commented Dec 13, 2023

New Feature Summary

Not important, but it has popped up in my mind several times.

The question is whether it is worthwhile to also save the frame classifications in the MMIF file. It would take some space, but it might be worth it if there is downstream use.

We would need to think about whether this requires updates in the vocabulary.

@marcverhagen marcverhagen added the ✨N New feature or request label Dec 13, 2023
@clams-bot clams-bot added this to apps Dec 13, 2023
@github-project-automation github-project-automation bot moved this to Todo in apps Dec 13, 2023
@marcverhagen
Contributor Author

One reason it could be useful to keep at least the predictions for the still frames included in the TimeFrames is that it may help us pick the best frames from a TimeFrame.

@owencking
Collaborator

I agree that this is probably not high priority, especially if it requires updating the MMIF vocabulary.

However, as Marc said, I think that one benefit is that it increases the chances of being able to grab one or two of the most representative still frames from a labeled duration.

Another approach that could provide this benefit would be to add an additional attribute to the time-based annotation: some indication of a particular instant that is representative of the annotated period. For example, if we have a time period of 46s to 73s labeled as "slate", we might have an additional piece of info in that annotation telling us that a highly representative frame occurred at 54000ms. The choice of that frame could be based on the CV-based frame classifications.

But, again, I don't think this is high priority. Much more in the category of "potentially nice to have".

@marcverhagen
Contributor Author

In the course of doing the non-MMIF approach to the SWT still-frame evaluation, it may actually be very useful to have this, since we would be evaluating the still-frame predictions, not the timeframe predictions.

As for MMIF, since we allow extra properties it would not be illegal to add what we need to any annotation type. So we could have representative_frames on a TimeFrame annotation:

{
    "@type": "https://mmif.clams.ai/vocabulary/TimeFrame/v1/",
    "properties": {
        "id": "tf1",
        "start": 5000,
        "end": 12000,
        "frameType": "slate",
        "representative_frames": [6000, 7000, 10000]
    }
}

Or a time point with labels

{
    "@type": "https://mmif.clams.ai/vocabulary/TimePoint/v1/",
    "properties": {
        "id": "tp1",
        "timePoint": 6000,
        "swt-label": "slate",
        "score": 0.9856
    }
}

Or even label scores:

{
    "@type": "https://mmif.clams.ai/vocabulary/TimePoint/v1/",
    "properties": {
        "id": "tp1",
        "timePoint": 6000,
        "slate-score": 0.9856,
        "chyron-score": 0.0144,
        "credits-score": 0.0000
    }
}

I am not suggesting any of these; the point is that we can do this if we want. And I do intend to experiment with that for the SWT output.

@keighrim
Member

keighrim commented Feb 2, 2024

This is also related to #52.

From @marcverhagen's email on 1/30/24 regarding the MMIF representation as of v3 (working version):


Just to illustrate, here is a TimeFrame from SWT:

{
  "@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v1",
  "properties": {
    "start": 2000,
    "end": 6000,
    "frameType": "chyron",
    "score": 0.9963429927825928,
    "scores": [
      0.9952167272567749,
      0.99429851770401,
      0.9968003034591675,
      0.9958693385124207,
      0.9995300769805908
    ],
    "points": [
      2000,
      3000,
      4000,
      5000,
      6000
    ],
    "representatives": [
      6000
    ],
    "id": "tf_1"
  }
}

It is somewhat ad hoc and I cannot say I like it, but it does have the information you want, I think.


Then, after clamsproject/app-easyocr-wrapper#2 was raised, we discussed using TimePoints as "raw" anchors for image classification labels, and then adding TimeFrames on top of the "stitched" time points, with a targets property to hold the ids of the time points within the interval. That would be something like:

"annotations": [
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
    "properties": { "id": "tp1", "point": 2000, "labels": {???}   # internal structure to store probabilities by labels 
  },
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
    "properties": { "id": "tp2", "point": 3000, "labels": {???}
  },
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
    "properties": { "id": "tp3", "point": 4000, "labels": {???}
  },
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
    "properties": { "id": "tp4", "point": 5000, "labels": {???}
  },
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
    "properties": { "id": "tp5", "point": 6000, "labels": {???}
  },
  { 
    "@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v1",
    "properties": { 
      "id": "tf_1",
      "targets": [ "tp1",  "tp2",  "tp3",  "tp4",  "tp5" ],
      "representatives": [ "tp5" ],
      ... # and possibly more props
    }
  } 
 ... # and more points and frames
]

For the labels property, I believe more discussion is needed, but from yesterday's discussion the following were suggested:

# two-array representation
{
  "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
  "properties": {
    "id": "tp1",
    "point": 2000,
    "labels": ["bar", "slate", "chyron", "credit", "NEG"],
    "scores": [0.1, 0.2, 0.3, 0.4, 0.5]
  }
}
# pros: the "labels" field can be _factored_ into `view.metadata.contains` to save some bytes
# cons: sacrifices readability

# object (dict) representation
{
  "@type": "http://mmif.clams.ai/vocabulary/TimePoint/v1",
  "properties": {
    "id": "tp1",
    "point": 2000,
    "labels": {"bar": 0.1, "slate": 0.2, "chyron": 0.3, "credit": 0.4, "NEG": 0.5}
  }
}
# pros: much more readable
# cons: we have never used properties with such "nested" objects, and the current MMIF specification is written vaguely enough that we could not decide whether this is actually allowed or not.
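
For what it's worth, the two forms carry the same information and are trivially interconvertible. A minimal sketch in plain Python (no mmif-python helpers assumed; the label order is illustrative):

# two-array form
labels = ["bar", "slate", "chyron", "credit", "NEG"]
scores = [0.1, 0.2, 0.3, 0.4, 0.5]

# two-array -> dict (object) form
classification = dict(zip(labels, scores))
# {'bar': 0.1, 'slate': 0.2, 'chyron': 0.3, 'credit': 0.4, 'NEG': 0.5}

# dict -> two-array form; a fixed, documented label order is what would make
# factoring "labels" into view.metadata.contains possible
labels_again = list(classification)
scores_again = [classification[l] for l in labels_again]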

@keighrim
Member

keighrim commented Feb 5, 2024

moving #62 (comment)

Did much of this in eaa6522. But due to limitations imposed by mmif-python (see clamsproject/mmif-python#252), the classification for timepoints looks like this:

"classification": [
    "slate:4.1978837543865666e-05",
    "chyron:0.9895544052124023",
    "credit:0.0007274810341186821",
    "NEG:0.009676150046288967"
]

Not sure which is worse, the above or this:

"labels": ["slate", "chyron", "credit", "NEG"],
"scores": [4.1978837543865666e-05, 0.9895544052124023, 0.0007274810341186821,  0.009676150046288967]
"classification": [
    "slate:4.1978837543865666e-05",
    "chyron:0.9895544052124023",
    "credit:0.0007274810341186821",
    "NEG:0.009676150046288967"
]

I don't think this is a reasonable or responsible implementation, and it will cause lots of problems for downstream apps: it dumps complex data types into strings without any clear specification or instructions for parsing them, eventually offloading an outrageous amount of responsibility onto other developers.
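
To illustrate that burden, here is a sketch of the parsing code every consumer would have to reinvent; note that nothing specifies ":" as the delimiter, so even the rpartition below is an assumption:

# parse the string-packed "classification" values shown above
classification = [
    "slate:4.1978837543865666e-05",
    "chyron:0.9895544052124023",
    "credit:0.0007274810341186821",
    "NEG:0.009676150046288967",
]
scores = {}
for item in classification:
    # split on the last ':' in case a label itself ever contains one
    label, _, score = item.rpartition(":")
    scores[label] = float(score)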

I actually have no issue with the two-lists representation, but if the dictionary representation is really crucial for this, I think we should either

  1. wait for updates in mmif-python (and clams-python) to support it, or
  2. add such data structures by working around the helper functions (e.g., direct JSON manipulation).

@marcverhagen
Contributor Author

Agreed, two lists is better than one list of strings that each pack a more complex object. There is a bit of a precedent with identifiers that concatenate a view id and an annotation id, but this takes the ad hoc-ness to a higher level.

@marcverhagen
Contributor Author

We do still have two lists that depend on each other, which also saddles the downstream developer with extra work, perhaps as bad as unpacking the serialized label-score pairs. This is too idiosyncratic for helper functions to deal with, so I think we may be waiting for an updated mmif-python and/or clams-python.
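
For reference, the reconstruction each downstream app would have to repeat is short but unspecified; a sketch assuming the annotation has been deserialized into a plain dict (variable names hypothetical):

props = annotation["properties"]
label_scores = dict(zip(props["labels"], props["scores"]))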

@keighrim
Member

keighrim commented Feb 6, 2024

With the dict-based representation of the classification scores, I wonder what would be the best way to specify that dict structure in the app metadata (related to clamsproject/clams-python#194).

The metadata spec was updated in #63 to specify the possible frameType values as follows:

metadata.add_input(DocumentTypes.VideoDocument, required=True)
metadata.add_output(AnnotationTypes.TimeFrame, frameType='bars')
metadata.add_output(AnnotationTypes.TimeFrame, frameType='slate')
metadata.add_output(AnnotationTypes.TimeFrame, frameType='chyron')
metadata.add_output(AnnotationTypes.TimeFrame, frameType='credits')

And the same can be done with the addition of the TimePoint at_type:

metadata.add_input(DocumentTypes.VideoDocument, required=True) 
metadata.add_output(AnnotationTypes.TimePoint, label='bars', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimePoint, label='slate', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimePoint, label='chyron', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimePoint, label='credits', timeUnit='milliseconds') 
# not sure what the value of `label` prop will be when the score for NEG is the top
metadata.add_output(AnnotationTypes.TimeFrame, frameType='bars', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimeFrame, frameType='slate', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimeFrame, frameType='chyron', timeUnit='milliseconds') 
metadata.add_output(AnnotationTypes.TimeFrame, frameType='credits', timeUnit='milliseconds') 

But for a classification property that is a dict with a fixed set of keys, I'm a little lost as to how we would add it to the output specification in AppMetadata.

Related to that, there was previously some discussion about specifying data types, instead of data values, for output specs, but that was never implemented in the SDK. And even with type-level specification, I don't think there is an easy representation for such complex data types in at_type properties.

Note that these I/O specs are currently under-implemented, but we hope that in the future they will be used for type coercion in workflow engines, so having a clear design is a critical piece of work for that future.

In addition, I've planned for a while to use these I/O specs for searching in the AppDir, to improve the AppDir user experience (clamsproject/apps#59).

@keighrim
Member

keighrim commented Mar 7, 2024

Fixed via #83.

@keighrim keighrim closed this as completed Mar 7, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in apps Mar 7, 2024