
[ENH] Test & Score: Add comparison of models #4261

Merged: 5 commits merged into biolab:master on Jan 24, 2020

Conversation

@janezd (Contributor) commented Dec 9, 2019

Solves #3891.

Includes
  • Code changes
  • Tests
  • Documentation

@janezd janezd changed the title [WIP] [ENH] Test & Score: Add comparison of models [RFC] [WIP] [ENH] Test & Score: Add comparison of models Dec 10, 2019
@janezd janezd added the needs discussion Core developers need to discuss the issue label Dec 12, 2019
@janezd janezd force-pushed the comparison-of-models branch 2 times, most recently from 56057d6 to 84d8ff0 Compare December 13, 2019 16:02
@janezd janezd changed the title [RFC] [WIP] [ENH] Test & Score: Add comparison of models [WIP] [ENH] Test & Score: Add comparison of models Dec 19, 2019
@janezd janezd removed the needs discussion Core developers need to discuss the issue label Dec 19, 2019
@ajdapretnar (Contributor)

This change makes Test&Score even more complicated. Is there a way to support this functionality outside of this widget? Perhaps as something that comes after Test&Score?

@janezd janezd force-pushed the comparison-of-models branch 4 times, most recently from 0f02449 to 687f13a Compare December 22, 2019 21:15
@codecov (bot) commented Dec 22, 2019

Codecov Report

Merging #4261 into master will increase coverage by 0.12%.
The diff coverage is 99.37%.

@@            Coverage Diff             @@
##           master    #4261      +/-   ##
==========================================
+ Coverage   86.91%   87.04%   +0.12%     
==========================================
  Files         396      396              
  Lines       71900    72462     +562     
==========================================
+ Hits        62493    63071     +578     
+ Misses       9407     9391      -16

@janezd janezd force-pushed the comparison-of-models branch 2 times, most recently from 9d8b70f to 7321620 Compare December 23, 2019 13:08
@janezd janezd changed the title [WIP] [ENH] Test & Score: Add comparison of models [ENH] Test & Score: Add comparison of models Dec 23, 2019
@janezd janezd force-pushed the comparison-of-models branch from e4b37d7 to 2497aa9 Compare December 23, 2019 13:28
@ajdapretnar (Contributor)

I applaud the documentation commit. 👏 Very diligent.

@janezd (Contributor, Author) commented Dec 23, 2019

> I applaud the documentation commit.

I pushed two commits. One adds a long-sought Bayesian test for comparison of classifiers in some 100 lines of reasonably decent code (not the best I ever wrote, but it's OK), accompanied by 150 lines of tests that systematically check all branches. I worked on it for, say, two days. The second commit adds a few sentences I wrote in ten minutes while waiting for a check-up in a hospital. And you applaud the latter commit?! What an encouragement!

Anyway, I wanted to ask you for an updated screenshot with the two new numbers -- assuming the layout is confirmed (but save the workflow, just in case).
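(For orientation: the Bayesian test mentioned above has the general shape of a Bayesian correlated t-test over paired cross-validation scores. Below is a minimal sketch under that assumption; the function name, signature, and details are illustrative, not the PR's actual code.)

import numpy as np
from scipy.stats import t as student_t

# Sketch only: Bayesian correlated t-test over paired k-fold CV scores.
# Not the code added in this PR; names and signature are made up.
def bayesian_comparison(scores_a, scores_b, rope=0.0, k=10):
    """Return (P(A better), P(difference within rope), P(B better))."""
    diff = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    n, mean, var = len(diff), diff.mean(), diff.var(ddof=1)
    if var == 0:  # degenerate: identical difference on every fold
        return float(mean > rope), float(abs(mean) <= rope), float(mean < -rope)
    # Scores from overlapping CV training sets are correlated; the usual
    # correction (Nadeau & Bengio) takes rho = 1/k.
    rho = 1 / k
    posterior = student_t(df=n - 1, loc=mean,
                          scale=np.sqrt(var * (1 / n + rho / (1 - rho))))
    p_b = posterior.cdf(-rope)           # B is meaningfully better
    p_rope = posterior.cdf(rope) - p_b   # difference within the rope
    return 1 - posterior.cdf(rope), p_rope, p_b

(With rope=0 the middle probability vanishes and the test reduces to P(A better) versus P(B better) -- presumably what the widget shows when "Negligible difference" is unchecked.)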

@ajdapretnar (Contributor) commented Jan 10, 2020

I clicked through the widget and it works well. A couple of comments, though:

  1. The width of the columns depends on the learner name, so short names get squeezed into substandard cells while long names get really comfy. 😁 I propose a minimum column width plus eliding longer learner names.

[Screenshot: comparison table with columns sized to the learner names]

  2. When a single learner is on the input, the bottom part should probably be disabled.

[Screenshot: comparison table with a single learner on the input]

  3. Names of learners, both in rows and in columns, appear to be clickable, even though they're not. This can probably be disabled somewhere, so the GUI is not misleading.

  4. [IDEA] How about coloring the cells according to the score? Say with light blue, and the stronger the color, the higher the score?

@janezd janezd force-pushed the comparison-of-models branch 2 times, most recently from b48f25f to 4ad8f3d Compare January 10, 2020 12:04
@janezd (Contributor, Author) commented Jan 10, 2020

I implemented 1-3, but I won't do the coloring. I agree it would be nice, but it requires too much machinery (going into the models and defining delegates to either paint HTML or do the painting the hard way...), or it would have to be quick and dirty.

@VesnaT, you can review this as it is.
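(For the record, the "quick and dirty" route alluded to above would be to set a background brush per cell instead of writing a delegate. A hypothetical sketch -- not part of the PR; helper name and score range are made up:)

from AnyQt.QtGui import QBrush, QColor

# Hypothetical quick-and-dirty coloring (rejected in this PR): map the
# score to the opacity of a light blue, so stronger color = higher score.
def color_by_score(item, score, lo=0.5, hi=1.0):
    frac = min(1.0, max(0.0, (score - lo) / (hi - lo)))
    item.setBackground(QBrush(QColor(100, 150, 255, int(120 * frac))))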

header.setSectionResizeMode(QHeaderView.ResizeToContents)
avg_width = self.fontMetrics().averageCharWidth()
header.setMinimumSectionSize(8 * avg_width)
header.setMaximumSectionSize(15 * avg_width)
Contributor:

I think the maximumSectionSize should be the same as the minimumSectionSize, otherwise column names are strangely elided:

[Screenshot: strangely elided column names]

Contributor (Author):

That is, we can have a fixed size. Alternatively, we can stretch all sections. Which would you prefer? I prefer stretching. Can you please try header.setSectionResizeMode(QHeaderView.Stretch) (in line 318) and tell me whether you like it? It is a bit of a stretch for just two methods, while with three it's already OK.

Contributor:

I like it for three methods. I don't like it for two or for a single method, and it makes the widget inconsistent with the evaluation table.

Contributor (Author):

Now it switches between Stretch and Fixed, depending on the number of methods. Fixed has a default size about the same as "Logistic regression" above (15 average characters). Stretch has a minimum size, so the view starts scrolling when there are too many methods.
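(That behaviour could look roughly like the following inside the widget -- a sketch assuming header is the table's horizontal QHeaderView and n_methods counts the learners; not the exact PR code:)

from AnyQt.QtWidgets import QHeaderView

# Sketch of the Stretch-vs-Fixed switching described above (assumed names).
avg_width = self.fontMetrics().averageCharWidth()
if n_methods <= 2:
    header.setSectionResizeMode(QHeaderView.Fixed)
    header.setDefaultSectionSize(15 * avg_width)  # ~ width of "Logistic regression"
else:
    header.setSectionResizeMode(QHeaderView.Stretch)
    header.setMinimumSectionSize(8 * avg_width)   # starts scrolling when crowded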

Orange/widgets/evaluate/owtestlearners.py (resolved)
        self._invalidate()
        self.__update()

    def _update_comparison_enabled(self):
        self.comparison_table.setEnabled(
Contributor:

Why disable the table when nothing can be clicked anyway?
The upper (Evaluation Results) table is never disabled, even when no data is present.
Besides, when the data is removed from the widget, the comparison table stays enabled, even though it holds no data.

Contributor (Author):

I like disabling it because it shows the user that it's intentionally blank. Otherwise it looks like a bug when the upper table is filled and this one isn't (e.g. when using Leave one out). Hiding would also be an option, though I like disabling better -- like "something could be here, but currently isn't because I can't compute it in this situation".

I can disable it when there is no data. But in this case we should do the same with the above table, I suppose. We need to discuss this.

Contributor (Author):

Both views are now disabled under the same conditions.
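(A sketch of that resolution; comparison_table appears in the PR, while score_table, data, and results are assumed names:)

# Illustrative: both tables now share one enabled/disabled condition.
def _update_enabled(self):
    enabled = self.data is not None and self.results is not None
    self.score_table.setEnabled(enabled)       # Evaluation Results
    self.comparison_table.setEnabled(enabled)  # Model comparison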

@@ -611,6 +755,8 @@ def _invalidate(self, which=None):
            item.setData(None, Qt.DisplayRole)
            item.setData(None, Qt.ToolTipRole)

        self.comparison_table.clearContents()
Contributor:

This only clears the contents, but retains the headers.
I'm not sure where the right place to remove headers is, but it should be handled somewhere (once you remove the learners from the widget, their names are still present in the comparison table).

Contributor (Author):

Fixed.
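(The gist of such a fix, as a sketch: on a QTableWidget, clearContents() keeps the header items, so the stale learner names have to be dropped explicitly:)

# clearContents() alone leaves the header labels behind; resetting the
# counts (or calling clear()) removes the stale learner names as well.
self.comparison_table.clearContents()
self.comparison_table.setRowCount(0)
self.comparison_table.setColumnCount(0)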

        gui.checkBox(hbox, self, "use_rope",
                     "Negligible difference: ",
                     callback=self.update_comparison_table)
        gui.lineEdit(hbox, self, "rope", validator=QDoubleValidator(),
Contributor:

Why is this not a spinbox?
It should probably be disabled when use_rope is not checked.

Contributor (Author):

It's not a spin box because it has no defined range. It can refer to AUC, which is between 0 and 1, or to RMSE, which is between 0 and infinity -- it can easily be 100000.

Disabling it would make sense, I'll do that.

Contributor (Author):

I added disabling but didn't like it. Let's let the user change the line edit first and then enable it, if (s)he wishes.

I added a method _on_use_rope_changed. You can add a line self.controls.rope.setEnabled(self.use_rope) and see for yourself that you won't like it. :)
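(The rejected wiring, spelled out -- _on_use_rope_changed and the setEnabled line are quoted from the comment above; the rest of the callback body is assumed:)

# The enabling logic that was tried and ultimately left out ("won't fix"):
def _on_use_rope_changed(self):
    self.controls.rope.setEnabled(self.use_rope)
    self.update_comparison_table()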

Contributor (Author):

This is a won't fix. :)

Orange/widgets/evaluate/owtestlearners.py (outdated, resolved)
@janezd janezd force-pushed the comparison-of-models branch from 4ad8f3d to b6ab70d Compare January 15, 2020 16:46
@janezd janezd force-pushed the comparison-of-models branch 3 times, most recently from 631e5ba to e94f500 Compare January 17, 2020 11:04
@VesnaT (Contributor) commented Jan 20, 2020

Why is the Evaluation Results table now disabled when one learner is on the input?

Right before the model comparison is performed, all the scores are shown in the upper-left cell of the Model comparison table for a brief moment.

@janezd janezd force-pushed the comparison-of-models branch from e94f500 to e1fad41 Compare January 20, 2020 17:46
@janezd (Contributor, Author) commented Jan 20, 2020

> Why is the Evaluation Results table now disabled when one learner is on the input?

Because I'm stupid, but I've fixed that. (Now I'm smart.)

> Right before the model comparison is performed, all the scores are shown in the upper-left cell of the Model comparison table for a brief moment.

Interesting. Fixed, too.

@janezd janezd force-pushed the comparison-of-models branch from 4051e02 to 66bef42 Compare January 24, 2020 10:23
@janezd janezd force-pushed the comparison-of-models branch from 593640a to a27cce6 Compare January 24, 2020 12:38
@VesnaT VesnaT merged commit 64f0e48 into biolab:master Jan 24, 2020