Non-scalar metrics: categories, sequences, scatterplots etc #6

Open
DavidHuji opened this issue May 3, 2020 · 4 comments

@DavidHuji

I think it would be really cool if we had scatterplot metrics (where the variables consist of the numerical metrics and maybe also some numerical inputs).

@arthur-flam
Member

arthur-flam commented May 3, 2020

It touches a really important point: scalar metrics are not good enough.

There are a few use cases that people have asked for a solution to:

  1. metrics per category (e.g. metrics per image channel, all/red/green/blue)
  2. sequential metrics (e.g. SNR over multiple frames)
  3. ...and yes, scatterplots can be useful!
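
To make the three cases above concrete, here is a minimal sketch of how such values could be represented as plain JSON-serializable data and told apart by shape. The metric names, the values, and the `metric_kind` helper are all hypothetical; nothing here is an existing QA-Board convention.

```python
import json

# Hypothetical payload for one run; the keys and values are made up.
run_metrics = {
    "rmse": 1.8,                                   # scalar, as supported today
    "rmse_per_channel": {"all": 1.8, "red": 2.1, "green": 1.5, "blue": 1.9},
    "snr_per_frame": [31.2, 30.8, 29.9],           # one value per frame
    "rmse_vs_exposure": [[0.5, 2.4], [1.0, 1.8]],  # scatter points as [x, y]
}


def metric_kind(value):
    """Classify a metric value by its shape so a viewer could pick a renderer."""
    if isinstance(value, (int, float)):
        return "scalar"
    if isinstance(value, dict):
        return "categorical"
    if isinstance(value, list) and all(isinstance(v, (int, float)) for v in value):
        return "sequence"
    if isinstance(value, list) and all(
        isinstance(v, list) and len(v) == 2 for v in value
    ):
        return "scatter"
    raise ValueError(f"unsupported metric shape: {value!r}")


kinds = {name: metric_kind(value) for name, value in run_metrics.items()}
print(json.dumps(kinds, indent=2))
```

Inferring the kind from the shape keeps the payload simple, but an explicit type declared next to each metric definition might be less ambiguous; that choice is left open here.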

What makes it hard to do right is the huge number of use cases and the many ways users expect the aggregation to be done. For instance, some want sequential metrics per region of interest and, for the aggregation, only the last frame... I am unsure what a generic API should look like, and the status quo is decent.
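
As an illustration of the aggregation problem described above, the sketch below collapses a sequential metric, stored per region of interest, to a single value using a named reducer. The `REDUCERS` table, the `summarize` function, and the sample values are assumptions made for the example, not a proposed API.

```python
from statistics import mean

# Hypothetical reducers; the names are illustrative only.
REDUCERS = {
    "last": lambda values: values[-1],
    "mean": mean,
    "min": min,
    "max": max,
}


def summarize(sequence, how="last"):
    """Collapse a sequential metric to the scalar shown in a table view."""
    if not sequence:
        raise ValueError("cannot summarize an empty sequence")
    return REDUCERS[how](sequence)


# A sequential metric per region of interest (made-up values).
snr = {"center": [31.0, 30.5, 29.0], "corner": [27.0, 26.5, 26.0]}

# "Last frame only", applied independently to each region.
last_frame = {roi: summarize(values, "last") for roi, values in snr.items()}
print(last_frame)
```

Even this small example shows why a generic API is hard: the reducer, the grouping key, and the order in which they are applied are all choices that differ between users.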

Right now, users generate plots to show this data, e.g. plotly graphs, as visualizations. But better support would mean showing the metrics "as they should be" in the table views or in the output cards.

Related: I also want to enable more "dynamic" metrics definitions in metrics.yaml.

In short, let's continue the discussion, I am looking forward to seeing how you'd like the API to be.

@arthur-flam changed the title from "Scatterplot" to "Non-scalar metrics: categories, sequences, scatterplots etc" on May 3, 2020
@DavidHuji
Author

Thanks.
IMO we need two abstraction levels of metrics here: first, the one we have now, which enables numeric metrics per single run; and second, metrics that analyze many first-type metrics across many runs. The second type may include numeric metrics (avg, max, etc.) or visual graphs such as scatterplots, histograms, etc. I think the first step is to create such an abstraction level and a simple API to the relevant data (which consists of many results from many single runs).
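
A minimal sketch of the two levels described above, assuming each run exposes a flat dictionary of scalar metrics. `RunResult`, `batch_summary`, and `scatter_points` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple


@dataclass
class RunResult:
    """Level 1: scalar metrics computed for a single run (hypothetical type)."""
    input_name: str
    metrics: Dict[str, float]


def batch_summary(runs: List[RunResult], metric: str) -> Dict[str, float]:
    """Level 2: numeric metrics computed over the results of many runs."""
    values = [run.metrics[metric] for run in runs if metric in run.metrics]
    return {"avg": mean(values), "max": max(values), "count": len(values)}


def scatter_points(
    runs: List[RunResult], x_metric: str, y_metric: str
) -> List[Tuple[float, float]]:
    """Level 2: (x, y) pairs for a scatterplot of one metric against another."""
    return [
        (run.metrics[x_metric], run.metrics[y_metric])
        for run in runs
        if x_metric in run.metrics and y_metric in run.metrics
    ]


# Made-up results from two runs.
runs = [
    RunResult("a.raw", {"rmse": 1.0, "runtime": 10.0}),
    RunResult("b.raw", {"rmse": 3.0, "runtime": 14.0}),
]
summary = batch_summary(runs, "rmse")
points = scatter_points(runs, "runtime", "rmse")
print(summary, points)
```

The second-level functions only need read access to a list of run results, which is the "simple API to the relevant data" that the comment asks for.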

@arthur-flam
Member

arthur-flam commented May 6, 2020

I really like the idea of aggregating metrics from different runs (not exclusive to having more complex metrics!) into custom visualizations.

I once thought about offering users some config like outputs.aggregations: [viz1, viz2] that would work seamlessly with the app's filtering. I am not sure how users would specify it, though... Maybe we could use some magic on top of plotly specs, where x: $rmse.$key, y: $rmse.$value would work with a metric like rmse: {blue: 1, red: 2..}... For simple cases I figure it would work well. But it would get complex very fast. I'm open to suggestions and happy to help write some code to implement it!
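
The placeholder idea above could be sketched roughly as follows: walk a plotly-style figure dictionary and replace strings such as `$rmse.$key` with data taken from a categorical metric. The `resolve` function and the exact placeholder grammar are assumptions; the comment only hints at the syntax.

```python
import re

# Matches whole-string placeholders such as "$rmse.$key" or "$rmse.$value".
PLACEHOLDER = re.compile(r"^\$(\w+)\.\$(key|value)$")


def resolve(spec, metrics):
    """Recursively replace placeholders in a plotly-like spec with metric data."""
    if isinstance(spec, dict):
        return {k: resolve(v, metrics) for k, v in spec.items()}
    if isinstance(spec, list):
        return [resolve(v, metrics) for v in spec]
    if isinstance(spec, str):
        match = PLACEHOLDER.match(spec)
        if match:
            name, part = match.groups()
            data = metrics[name]
            return list(data.keys()) if part == "key" else list(data.values())
    return spec  # anything else is passed through unchanged


# Hypothetical user-provided spec and the metric from the example above.
spec = {"data": [{"type": "bar", "x": "$rmse.$key", "y": "$rmse.$value"}]}
metrics = {"rmse": {"blue": 1, "red": 2}}

figure = resolve(spec, metrics)
print(figure)
```

This handles the simple case only. Combining several runs, nested metrics, or filters would need a richer grammar, which is probably where the complexity mentioned above starts.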

Another direction could be giving users a hook to create the aggregated visualizations themselves for a whole batch after it finishes, like batch_postprocess(runs, output_dir). If they use e.g. plotly, filtering would require users to somehow annotate the data to map data points back to the runs they relate to...
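
A rough sketch of what such a hook might look like, assuming each run is a dictionary with an `id` and a `metrics` mapping. Only the name `batch_postprocess(runs, output_dir)` comes from the comment above; the run structure, the output file name, and `filter_points` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def batch_postprocess(runs, output_dir):
    """Hypothetical hook, called once after every run in a batch has finished."""
    # Tag each point with its run id so the points can be filtered later.
    points = [
        {
            "x": run["metrics"]["runtime"],
            "y": run["metrics"]["rmse"],
            "run_id": run["id"],
        }
        for run in runs
    ]
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / "rmse_vs_runtime.json"
    out_path.write_text(json.dumps({"points": points}, indent=2))
    return out_path


def filter_points(points, visible_run_ids):
    """Keep only the points whose run is still selected by the app's filters."""
    return [p for p in points if p["run_id"] in visible_run_ids]


# Made-up batch of two runs.
runs = [
    {"id": "run-1", "metrics": {"rmse": 1.0, "runtime": 10.0}},
    {"id": "run-2", "metrics": {"rmse": 3.0, "runtime": 14.0}},
]
with tempfile.TemporaryDirectory() as tmp:
    path = batch_postprocess(runs, tmp)
    saved = json.loads(path.read_text())["points"]

visible = filter_points(saved, {"run-2"})
print(visible)
```

Storing a `run_id` next to every point is one way to address the annotation concern: the viewer can drop points whose run is filtered out without re-running the hook.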

Using different output types (single/aggregated) could work too. The ergonomics would be more complex, but we could do something with the WIP pipeline feature...

Again I'm very open to ideas, especially concrete API ideas. My main preoccupation is finding something that works and is easy to use and document.

@arthur-flam
Member

By the way, some aspects overlap a bit with "pipelines". Here is a WIP spec; feel free to discuss it here. I may open a separate issue when we start implementing it:
https://samsung.github.io/qaboard/docs/dag-pipelines


2 participants