-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] OWRandomize: Add a new widget #1863
Conversation
Current coverage is 89.28% (diff: 100%)@@ master #1863 diff @@
==========================================
Files 86 86
Lines 9116 9115 -1
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
Hits 8138 8138
+ Misses 978 977 -1
Partials 0 0
|
data = None | ||
if self.data: | ||
rand_seed = self.random_seed or None | ||
_type = Randomize.RandomizeClasses * self.shuffle_class | \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_
instead of _type
(like exec_
in main. Where, by the way, the underscore is no longer necessary in Python 3, AFAIK).
|
||
def send_report(self): | ||
text = "No shuffling" | ||
labels = ["Classes", "Features", "Metas"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use lowercase here (Shuffle: Classes, Metas doesn't look nice)
labels = ["Classes", "Features", "Metas"] | ||
include = [self.shuffle_class, self.shuffle_attrs, self.shuffle_metas] | ||
if sum(include): | ||
text = ", ".join([l for i, l in zip(include, labels) if i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer printing "a, b, c, d and e", where possible. Try something like
def send_report(self):
labels = ["classes", "features", "metas"]
include = [self.shuffle_class, self.shuffle_attrs, self.shuffle_metas]
include = [label for labels, i in zip(labels, include)
text = " and ".join(filter(None, (", ".join(include[:-1]), include[-1:]))) or "none"
self.report_items ...
This doesn't require if
. If you find the last line too cryptic, you can get rid of filter
by handling len(include) == 1
separately.
self.send("Data", data) | ||
|
||
def send_report(self): | ||
text = "No shuffling" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer text = "none"
.
self.data = None | ||
|
||
# GUI | ||
box = gui.vBox(self.controlArea, "Shuffle") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vertical arrangement of check boxes with such short labels doesn't look nice. I think this is nicer:
box = gui.hBox(self.controlArea, "Shuffled columns")
box.layout().setSpacing(20)
That is, hBox
, longer (more informative) title, and spacing.
if sum(include): | ||
text = ", ".join([l for i, l in zip(include, labels) if i]) | ||
self.report_items("Settings", | ||
[("Shuffle", text), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the box title is changed, perhaps also change "Shuffle"
to "Shuffled columns"
.
text = ", ".join([l for i, l in zip(include, labels) if i]) | ||
self.report_items("Settings", | ||
[("Shuffle", text), | ||
("Scope", "{}% of data".format(self.scope_prop)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"of rows" would be more accurate than "of data". However, you can change "Scope" is changed to "Proportion of shuffled rows", and then put just "{} %".format(self.scope_prop)
.
@@ -326,8 +326,7 @@ class Randomize(Preprocess): | |||
>>> randomizer = Randomize(Randomize.RandomizeClasses) | |||
>>> randomized_data = randomizer(data) | |||
""" | |||
Type = Enum("Randomize", | |||
"RandomizeClasses, RandomizeAttributes, RandomizeMetas") | |||
Type = 1, 2, 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now no longer an enum. Why not something like this?
Type = Enum("Randomize", dict(RandomizeClasses=1, RandomizeAttributes=2, RandomizeMetas=4))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should it be an enum? It is no longer enum, because they don't support bitwise operators.
If you insist with enums, I should probably add two new parameters to the method (for each type of randomization). Or am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enums accept a type parameter, e.g. Enum(..., type=int)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, no, it doesn't have to be Enum
. We're mostly not using it. I just liked it and asked why it was removed.
resizing_enabled = False | ||
want_main_area = False | ||
|
||
shuffle_class = Setting(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer True
and False
, so everybody knows that these are boolean flags, not numbers. The trick in apply
will still work.
rand_seed = self.random_seed or None | ||
_type = Randomize.RandomizeClasses * self.shuffle_class | \ | ||
Randomize.RandomizeAttributes * self.shuffle_attrs | \ | ||
Randomize.RandomizeMetas * self.shuffle_metas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly how I'd do it and I like it. But perhaps
type_ = sum(part for part, flag in zip(Randomize.Type, [self.shuffle_class, shuffle_attrs, shuffle_metas]) if flag)
is better. (If you use [self.shuffle_class, shuffle_attrs, shuffle_metas]
in a lot of places, you can create a property
@property
def parts(self):
return [self.shuffle_class, shuffle_attrs, shuffle_metas]
and simplify this expression and also send_report
.
Done |
Issue
Create a new widget for shuffling columns in a data able.
Description of changes
Enable multiple types of randomization at the same time.
New widget Randomize.
Includes